BRIEF ON APPEAL

Sir: 

Further to the Notice of Appeal filed July 25, 2002, and received by the USPTO on August 2, 
2002, herewith are three copies of Appellants' Brief on Appeal. Authorized fees include the $320.00
fee for the filing of this Brief. 

This is an appeal from the decision of the Examiner finally rejecting Claims 12-17 of the above- 
identified application. 

(1) REAL PARTY IN INTEREST 

The above-identified application is assigned of record to Incyte Pharmaceuticals, Inc. (now
Incyte Genomics, Inc.) (Reel 7659, Frame 0629) which is the real party in interest herein.

(2) RELATED APPEALS AND INTERFERENCES
Appellants, their legal representative and the assignee are not aware of any related appeals or 
interferences which will directly affect or be directly affected by or have a bearing on the Board's 
decision in the instant appeal. 



(3) STATUS OF THE CLAIMS
Claims rejected:   Claims 12-17
Claims allowed:    (none)
Claims canceled:   Claims 1-11
Claims withdrawn:  Claims 18-22
Claims on Appeal:  Claims 12-17 (A copy of the claims on appeal, as amended, can be
                   found in the attached Appendix).



(4) STATUS OF AMENDMENTS AFTER FINAL 
There were no amendments submitted after Final Rejection. 



(5) SUMMARY OF THE INVENTION 
Appellants' invention is directed, inter alia, to polynucleotides encoding polypeptides having 
strong homology to canine C5a anaphylatoxin receptor (Perret, J.J., et al., (1992) Biochem. J.,
288:911-917) ("CALR") and compositions containing them, which have a variety of utilities, in the 
diagnosis of conditions or diseases characterized by expression of CALR and for drug discovery (see 
the Specification at, e.g., page 6, line 15 through page 7, line 1; page 9, lines 13-22). As described in 
the Specification: 

The novel C5a-like receptor (CALR) which is the subject of this patent 
application was identified among the cDNAs derived from a mast cell library. Incyte 
Clone No. 8118 is a novel nucleotide sequence which is more closely related to
CFCOMC5AM, the C5a anaphylatoxin receptor from dog (Perret JJ et al (1992)
Biochem J 288:911-17) than to the known human C5a receptor. (Specification, page
2, lines 8-12.) 
**************************************************

The present invention provides a unique nucleotide sequence identifying a novel 
C5a-like receptor which was first identified in human mast cells. The sequence for calr 
is shown in SEQ ID No 1 and is homologous to the GenBank sequence, 
CFCOMC5AM for canine C5a anaphylatoxin receptor. Incyte 8118 has 45% amino
acid identity with the C5a receptor and differs from it in having only three carboxylate 
residues in the N-terminus, two of which are Glu rather than Asp. In addition, the 
N-terminus of Incyte 8118 is shorter than that of the published C5a receptor and would 
be expected to have different binding specificity. 

Because CALR is specifically expressed in cells active in immunity, the nucleic 
acid (calr), polypeptide (CALR) and antibodies to CALR are useful in investigations of 
and interventions in the normal and abnormal physiologic and pathologic processes 
which comprise the mast cell's role in immunity. Therefore, an assay for upregulated 
expression of CALR can accelerate diagnosis and proper treatment of conditions 
caused by abnormal signal transduction due to anaphylactic or hypersensitive 
responses, systemic and local infections, traumatic and other tissue damage, hereditary 
or environmental diseases associated with hypertension, carcinomas, and other 
physiologic or pathologic problems. (Specification, page 6, lines 7-22.) 

*************************************************** 



The cDNA (SEQ ID NO 1) and amino acid (SEQ ID NO 2) sequences for 
human CALR are shown in Fig 1. Incyte's calr produced a BLAST score of 412 
when compared with the C5a receptor sequence and has a probability of 1.8 x 10^-50 that the
sequence similarity occurred by chance. This calr homolog also resembles various
N-formylpeptide receptors generating BLAST scores ranging from 381 to 363 with
probabilities of 7.4 x 10^-46 to 3.2 x 10^-43. When the translation of CALR was searched against
protein databases such as SwissProt and PIR, no exact matches were found. Fig 2
shows the comparison of the human calr sequence with that of the dog C5a receptor,
CFCOMC5AM. (Specification, page 16, line 29 through page 17, line 1.)



(6) THE FINAL REJECTIONS 
Claims 12-17 stand rejected under 35 U.S.C. §§ 101 and 112, first paragraph, based on the 
allegation that the claimed invention lacks patentable utility. The rejection alleges in particular that the 
invention has "no apparent or disclosed specific and substantial credible utility." (Final Office Action, 
page 3.) 

(7) ISSUES

1. Whether Claims 12-17 directed to a polynucleotide sequence encoding a C5a-like 
receptor meet the utility requirement of 35 U.S.C. §101. 

2. Whether one of ordinary skill in the art would know how to use the claimed 
polynucleotide, e.g., in toxicology testing, drug development, and the diagnosis of disease, so as to 
satisfy the enablement requirement of 35 U.S.C. §112, first paragraph. 

(8) GROUPING OF THE CLAIMS 

As to Issue 1

All of the claims on appeal are grouped together.

As to Issue 2

All of the claims on appeal are grouped together.

(9) APPELLANTS' ARGUMENTS 

The rejection of Claims 12-17 is improper, as the inventions of those claims have a 
patentable utility as set forth in the instant specification, and/or a utility well known to one of 
ordinary skill in the art. 

The invention at issue is a polynucleotide corresponding to a gene that is expressed in a human 
mast cell library established from the peripheral blood of a patient with mast cell leukemia. The novel 
polynucleotide codes for a polypeptide demonstrated in the patent specification to be a member of the 
class of C5a-like seven transmembrane receptors, whose biological functions include binding 
complement and activating the immune function of mast cells. (Specification, e.g., at page 1, line 3 
through page 3, line 16; page 6, lines 7-14; page 16, line 29 through page 17, line 1.) As such, the 
claimed invention has numerous practical, beneficial uses in drug development, and the diagnosis of 
disease. As a result of the benefits of these uses, the claimed invention already enjoys significant 
commercial success. 

The Patent Examiner contends that the claimed polynucleotide cannot be useful without precise 
knowledge of its biological function. But the law never has required knowledge of biological function to
prove utility. It is the claimed invention's uses, not its functions, that are the subject of a proper analysis 
under the utility requirement. 

I. The Applicable Legal Standard

To meet the utility requirement of sections 101 and 112 of the Patent Act, the patent applicant
need only show that the claimed invention is "practically useful," Anderson v. Natta, 480 F.2d 1392,
1397, 178 USPQ 458 (CCPA 1973) and confers a "specific benefit" on the public. Brenner v.
Manson, 383 U.S. 519, 534-35, 148 USPQ 689 (1966). As discussed in a recent Court of Appeals
for the Federal Circuit case, this threshold is not high:

An invention is "useful" under section 101 if it is capable of providing some identifiable 
benefit. See Brenner v. Manson, 383 U.S. 519, 534 [148 USPQ 689] (1966); 
Brooktree Corp. v. Advanced Micro Devices, Inc., 977 F.2d 1555, 1571 [24 
USPQ2d 1401] (Fed. Cir. 1992) ("to violate Section 101 the claimed device must be 
totally incapable of achieving a useful result"); Fuller v. Berger, 120 F. 274, 275 (7th 
Cir. 1903) (test for utility is whether invention "is incapable of serving any beneficial 
end"). Juicy Whip Inc. v. Orange Bang Inc., 51 USPQ2d 1700 (Fed. Cir. 1999). 

While an asserted utility must be described with specificity, the patent applicant need not
demonstrate utility to a certainty. In Stiftung v. Renishaw PLC, 945 F.2d 1173, 1180, 20 USPQ2d
1094 (Fed. Cir. 1991), the United States Court of Appeals for the Federal Circuit explained:

An invention need not be the best or only way to accomplish a certain result, and it 
need only be useful to some extent and in certain applications: "[T]he fact that an 
invention has only limited utility and is only operable in certain applications is not 
grounds for finding lack of utility." Envirotech Corp. v. Al George, Inc., 730 F.2d 
753, 762, 221 USPQ 473, 480 (Fed. Cir. 1984). 

The specificity requirement is not, therefore, an onerous one. If the asserted utility is described 
so that a person of ordinary skill in the art would understand how to use the claimed invention, it is 
sufficiently specific. See Standard Oil Co. v. Montedison, S.p.a., 212 U.S.P.Q. 327, 343 (3d Cir. 
1981). The specificity requirement is met unless the asserted utility amounts to a "nebulous expression" 
such as "biological activity" or "biological properties" that does not convey meaningful information 
about the utility of what is being claimed. Cross v. Iizuka, 753 F.2d 1040, 1048 (Fed. Cir. 1985). 



08/462,355 




Docket No.: PF-0040 US 

In addition to conferring a specific benefit on the public, the benefit must also be "substantial." 
Brenner, 383 U.S. at 534. A "substantial" utility is a practical, "real-world" utility. Nelson v. Bowler, 
626 F.2d 853, 856, 206 USPQ 881 (CCPA 1980). 

If persons of ordinary skill in the art would understand that there is a "well-established" utility 
for the claimed invention, the threshold is met automatically and the applicant need not make any 
showing to demonstrate utility. Manual of Patent Examining Procedure at § 706.03(a). Only if there
is no "well-established" utility for the claimed invention must the applicant demonstrate the practical 
benefits of the invention. Id. 

Once the patent applicant identifies a specific utility, the claimed invention is presumed to 
possess it. In re Cortright, 165 F.3d 1353, 1357, 49 USPQ2d 1464 (Fed. Cir. 1999); In re Brana,
51 F.3d 1560, 1566, 34 USPQ2d 1436 (Fed. Cir. 1995). In that case, the Patent Office bears the
burden of demonstrating that a person of ordinary skill in the art would reasonably doubt that the
asserted utility could be achieved by the claimed invention. Id. To do so, the Patent Office must
provide evidence or sound scientific reasoning. See In re Langer, 503 F.2d 1380, 1391-92, 183
USPQ 288 (CCPA 1974). If and only if the Patent Office makes such a showing, the burden shifts to 
the applicant to provide rebuttal evidence that would convince the person of ordinary skill that there is 
sufficient proof of utility. Brana, 51 F.3d at 1566. The applicant need only prove a "substantial 
likelihood" of utility; certainty is not required. Brenner, 383 U.S. at 532. 

II. The uses of polynucleotides encoding CALR for diagnosis of conditions or diseases 
characterized by expression of CALR and for drug discovery are sufficient utilities 
under 35 U.S.C. §§ 101 and 112, first paragraph 

The claimed invention meets all of the necessary requirements for establishing a credible utility 
under the Patent Law: There are "well-established" uses for the claimed invention known to persons of 
ordinary skill in the art, and there are specific practical and beneficial uses for the invention disclosed in 
the patent application's specification. Objective evidence further corroborates the credibility of the 
asserted utilities. 

A. The uses of polynucleotides encoding CALR for disease diagnosis are practical
uses that confer "specific benefits" to the public 

The claimed invention has specific, substantial, real-world utility by virtue of its use in disease 
diagnosis through gene expression profiling. There is no dispute that the claimed invention is in fact a 
useful tool in hybridization analysis used to perform gene expression analysis. That is sufficient to 
establish utility for the claimed polynucleotide. 

Nowhere does the Patent Examiner address the fact that, as described on page 6, line 15 
through page 7, line 1, and page 9, lines 13-22 of the Specification, the claimed polynucleotides can be 
used as highly specific hybridization probes in, for example, northerns - probes that without question 
can be used to measure both the existence and amount of complementary RNA sequences known to 
be the expression products of the claimed polynucleotides. The claimed invention is not, in that regard, 
some random sequence whose value as a probe is speculative or would require further research to 
determine. 

Given the fact that the claimed polynucleotide is known to be expressed, its utility as a 
measuring and analyzing instrument for expression levels is as indisputable as a scale's utility for 
measuring weight. This use as a measuring tool, regardless of how the expression level data ultimately 
would be used by a person of ordinary skill in the art, by itself demonstrates that the claimed invention 
provides an identifiable, real-world benefit that meets the utility requirement. Raytheon v. Roper, 724
F.2d 951 (Fed. Cir. 1983) (claimed invention need only meet one of its stated objectives to be useful);
In re Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999) (how the invention works is irrelevant to
utility); MPEP § 2107 ("Many research tools such as gas chromatographs, screening assays, and
nucleotide sequencing techniques have a clear, specific, and unquestionable utility (e.g., they are useful
in analyzing compounds)" (emphasis added)).

Though Appellants need not so prove to demonstrate utility, there can be no reasonable dispute 
that persons of ordinary skill in the art have numerous uses for information about relative gene 
expression including, for example, understanding the effects of a potential drug for treating mast cell- 
associated immune conditions caused by abnormal signal transduction due to anaphylactic or 
hypersensitive responses, systemic and local infections, traumatic and other tissue damage, hereditary 
or environmental diseases associated with hypertension, carcinomas, and other physiologic or 
pathologic problems. In other words, the person of ordinary skill in the art can derive more information
about a potential mast cell-associated immune condition drug candidate or potential toxin with the 
claimed invention than without it. 

B. The use of nucleic acids coding for proteins expressed by humans as tools for 
drug discovery and the diagnosis of disease is now "well-established" 

The technologies made possible by expression profiling and the DNA tools upon which they 
rely are now well-established. The technical literature recognizes not only the prevalence of these 
technologies, but also their unprecedented advantages in drug development, testing and safety 
assessment. 

Perret J.J. et al. (1992; Biochem J. 288:911-17; IDS Reference No. 6, incorporated by
reference into the instant application; Reference No. 1) state that "[u]ltimately, the availability of the
cloned receptors should help the design of pharmacologically active (non-peptide) inhibitors that could 
be used in syndromes were [sic: where] inappropriate complement activation occurs." (Perret, page 
917.) The Specification discusses using the polynucleotides "in production of chimeric molecules for 
selecting agonists, inhibitors or antagonists for design of domain-specific therapeutic molecules." 
(Specification, page 6, lines 27-29.) In addition the Specification describes the use of polypeptides 
encoded by the claimed polynucleotides in drug screening, for example, page 23, line 12 through page 
24, line 14. 

Because the Patent Examiner failed to address or consider the "well-established" utilities for the 
claimed invention in drug development, and the diagnosis of disease, the Examiner's rejections should 
be overturned regardless of their merit. 

C. The similarity of the polypeptide encoded by the claimed invention to another 
polypeptide of undisputed utility, as well as the expression of the CALR 
polypeptide in human mast cells, demonstrates utility 

The Examiner alleged that "[t]he instant claims are drawn to a protein of as yet undetermined 
function or biological significance. There is absolutely no evidence of record or any line of reasoning 
that would support a conclusion the [sic: that] a protein of the instant invention is associated in any way
with the plurality of causally unrelated disorders that are listed on page 6 of the instant specification" 
(Office Action mailed July 25, 2001, page 3.) 

Appellants submit that there is adequate evidence in the Specification, along with what is well 
known in the art, to provide a "line of reasoning" to support the asserted utility for the claimed 
polynucleotide. This evidence is provided by not only sequence identity between CALR and canine 
C5a anaphylatoxin receptor but also the expression of CALR in human mast cells. 

The utility of the claimed polynucleotide can be imputed based on the relationship between the 
polypeptide it encodes, CALR, and another polypeptide of unquestioned utility, canine C5a 
anaphylatoxin receptor. The two polypeptides have sufficient similarities in their sequences that a 
person of ordinary skill in the art would recognize more than a reasonable probability that the 
polypeptide encoded for by the claimed invention has utility similar to canine C5a anaphylatoxin 
receptor. Appellants need not show any more to demonstrate utility. In re Brana, 51 F.3d at 1567. 

It is undisputed, and readily apparent from the patent application, that the polypeptide encoded 
for by the claimed polynucleotide shares 46% sequence identity over 152 amino acid residues (L14 
through T165 of CALR) with canine C5a anaphylatoxin receptor. This is more than enough homology 
to demonstrate a reasonable probability that the utility of canine C5a anaphylatoxin receptor can be 
imputed to the claimed invention (through the polypeptide it encodes). It is well-known that the 
probability that two unrelated polypeptides share more than 40% sequence homology over 70 amino 
acid residues is exceedingly small. Brenner et al., Proc. Natl. Acad. Sci. 95:6073-78 (1998) 
(Reference No. 2). Given homology in excess of 40% over many more than 70 amino acid residues, 
the probability that the polypeptide encoded for by the claimed polynucleotide is related to canine C5a 
anaphylatoxin receptor is, accordingly, very high. 

The Examiner must accept the Appellants' demonstration that the homology between the 
polypeptide encoded for by the claimed invention and canine C5a anaphylatoxin receptor demonstrates 
utility by a reasonable probability unless the Examiner can demonstrate through evidence or sound 
scientific reasoning that a person of ordinary skill in the art would doubt utility. See In re Langer, 503
F.2d 1380, 1391-92, 183 USPQ 288 (CCPA 1974). The Examiner has not provided sufficient
evidence or sound scientific reasoning to the contrary.

Furthermore, confirmation of Appellants' identification of CALR as a human complement 
receptor is provided in Ames, R.S. et al. (1996; J. Biol. Chem., 271:20231-20234; "Molecular
Cloning and Characterization of the Human Anaphylatoxin C3a Receptor"; Reference No. 3), in which 
the authors describe a human anaphylatoxin C3a receptor that has 98% sequence similarity to CALR. 

As discussed supra, Perret et al. (Reference No. 1) describe how cloned receptors of this
family are useful in drug screening.

In addition the Specification describes how polynucleotides encoding CALR are expressed in 
human mast cells as well as the importance of mast cells in immune response. The Specification teaches 
that human mast cells have "an important role in promoting various immune responses and nonspecific 
inflammatory reactions" and "degranulate and discharge granule contents extracellularly," and further 
that: 

Mast cell granule contents include histamine, heparin, elastase, cathepsin G, eosinophil 
chemotactic factors, platelet activating factor, and slow-reacting substance of 
anaphylaxis. When complement cleavage products 3a, 4a, and 5a bind to their 
respective receptors on the surface of mast cells and basophils, they are capable of 
triggering the release of histamine and the other factors without the involvement of IgE. 
Some of the factors listed above are synthesized by mast cells during the course of 
hypersensitivity reactions and mediate vaso- and broncho-constriction leading to 
asthma. These and other mediators released following degranulation are responsible 
both for allergy symptoms and for immunity against some parasites. (Specification, 
pages 2-3.) 

The human mast cell line in which the claimed polynucleotide is expressed "was established from the 
peripheral blood of a Mayo Clinic patient with mast cell leukemia." (Specification, page 3.) The canine 
C5a receptor is "present on neutrophils, macrophages, and mast cells." (Specification, page 1, lines 9- 
10.) 

This disclosure provides adequate support for a "line of reasoning" linking the diseases listed on 
page 6 of the Specification with the claimed polynucleotide. One of skill in the art would reasonably 
believe that a receptor, expressed in human mast cells and highly similar to a canine C5a receptor, has 
utility at least in diagnosis and treatment of mast cell-associated immune conditions. 

Therefore, for at least the above reasons, the Specification provides adequate support for the 
asserted utility of the claimed polynucleotide.

D. Objective evidence corroborates the utilities of the claimed invention 

There is, in fact, no restriction on the kinds of evidence a Patent Examiner may consider in 
determining whether a "real-world" utility exists. Indeed, "real-world" evidence, such as evidence 
showing actual use or commercial success of the invention, can demonstrate conclusive proof of utility. 
Raytheon v. Roper, 220 USPQ 592 (Fed. Cir. 1983); Nestle v. Eugene, 55 F.2d 854, 856, 12
USPQ 335 (6th Cir. 1932). Indeed, proof that the invention is made, used or sold by any person or
entity other than the patentee is conclusive proof of utility. United States Steel Corp. v. Phillips
Petroleum Co., 865 F.2d 1247, 1252, 9 USPQ2d 1461 (Fed. Cir. 1989).

Over the past several years, a vibrant market has developed for databases containing all 
expressed genes (along with the polypeptide translations of those genes), in particular genes having 
medical and pharmaceutical significance such as the instant sequence. (Note that the value in these 
databases is enhanced by their completeness, but each sequence in them is independently valuable.) 
The databases sold by Appellants' assignee, Incyte, include exactly the kinds of information made 
possible by the claimed invention, such as tissue and disease associations. Incyte sells its database 
containing the claimed sequence and millions of other sequences throughout the scientific community, 
including to pharmaceutical companies who use the information to develop new pharmaceuticals. 

Both Incyte's customers and the scientific community have acknowledged that Incyte's
databases have proven to be valuable in, for example, the identification and development of drug
candidates. As Incyte adds information to its databases, including the information that can be generated
only as a result of Incyte's discovery of the claimed polynucleotide and its use of that polynucleotide on
cDNA microarrays, the databases become even more powerful tools. Thus the claimed invention adds 
more than incremental benefit to the drug discovery and development process. 

III. The Patent Examiner's Rejections Are Without Merit 

Rather than responding to the evidence demonstrating utility, the Examiner attempts to dismiss it 
altogether by arguing that the disclosed and well-established utilities for the claimed polynucleotide are 
not "specific and substantial credible" utilities. (Final Office Action, page 3.) The Examiner is incorrect
both as a matter of law and as a matter of fact. 

A. The Precise Biological Role Or Function Of An Expressed Polynucleotide Is 
Not Required To Demonstrate Utility 

The Patent Examiner's primary rejection of the claimed invention is based on the ground that, 
without information as to the precise "biological role" of the claimed invention, the claimed invention's 
utility is not sufficiently specific. According to the Examiner, it is not enough that a person of ordinary 
skill in the art could use and, in fact, would want to use the claimed invention to monitor the expression 
of genes for such applications as the evaluation of a drug's efficacy and toxicity. The Examiner would 
require, in addition, that the applicant provide a specific and substantial interpretation of the results 
generated in any given expression analysis. 

It may be that specific and substantial interpretations and detailed information on biological 
function are necessary to satisfy the requirements for publication in some technical journals, but they are 
not necessary to satisfy the requirements for obtaining a United States patent. The relevant question is 
not, as the Examiner would have it, whether it is known how or why the invention works, In re 
Cortright, 165 F.3d 1353, 1359 (Fed. Cir. 1999), but rather whether the invention provides an
"identifiable benefit" in presently available form. Juicy Whip Inc. v. Orange Bang Inc., 185 F.3d
1364, 1366 (Fed. Cir. 1999). If the benefit exists, and there is a substantial likelihood the invention
provides the benefit, it is useful. There can be no doubt that the present invention meets this test.

The threshold for determining whether an invention produces an identifiable benefit is low. 
Juicy Whip, 185 F.3d at 1366. Only those utilities that are so nebulous that a person of ordinary skill 
in the art would not know how to achieve an identifiable benefit and, at least according to the PTO 
guidelines, so-called "throwaway" utilities that are not directed to a person of ordinary skill in the art at 
all, do not meet the statutory requirement of utility. Utility Examination Guidelines, 66 Fed. Reg. 1092 
(Jan. 5, 2001). 

Knowledge of the biological function or role of a biological molecule has never been required to 
show real-world benefit. In its most recent explanation of its own utility guidelines, the PTO 
acknowledged so much (66 F.R. at 1095):

[T]he utility of a claimed DNA does not necessarily depend on the function of the 
encoded gene product. A claimed DNA may have specific and substantial utility 
because, e.g., it hybridizes near a disease-associated gene or it has gene-regulating 
activity. 

By implicitly requiring knowledge of biological function for any claimed nucleic acid, the 
Examiner has, contrary to law, elevated what is at most an evidentiary factor into an absolute 
requirement of utility. Rather than looking to the biological role or function of the claimed invention, the 
Examiner should have looked first to the benefits it is alleged to provide. 

B. Membership in a Class of Useful Products Can Be Proof of Utility 

Despite the uncontradicted evidence that the claimed polynucleotide encodes a polypeptide in 
the C5a-like seven transmembrane receptor family, the Examiner refused to impute the utility of the 
members of the C5a-like seven transmembrane receptor family to CALR. In the Office Action, the 
Patent Examiner takes the position that, unless Appellants can identify which particular biological 
function within the class of C5a-like seven transmembrane receptors is possessed by CALR, utility 
cannot be imputed. To demonstrate utility by membership in the class of C5a-like seven 
transmembrane receptors, the Examiner would require that all C5a-like seven transmembrane receptors 
possess a "common" utility. 

There is no such requirement in the law. In order to demonstrate utility by membership in a 
class, the law requires only that the class not contain a substantial number of useless members. So long 
as the class does not contain a substantial number of useless members, there is sufficient likelihood that 
the claimed invention will have utility, and a rejection under 35 U.S.C. § 101 is improper. That is true 
regardless of how the claimed invention ultimately is used and whether or not the members of the class 
possess one utility or many. See Brenner v. Manson, 383 U.S. 519, 532 (1966); Application of 
Kirk, 376 F.2d 936, 943 (CCPA 1967). 

Membership in a "general" class is insufficient to demonstrate utility only if the class contains a 
sufficient number of useless members such that a person of ordinary skill in the art could not impute 
utility by a substantial likelihood. There would be, in that case, a substantial likelihood that the claimed 
invention is one of the useless members of the class. In the few cases in which class membership did
not prove utility by substantial likelihood, the classes did in fact include predominately useless members. 
E.g., Brenner (man-made steroids); Kirk (same); Natta (man-made polyethylene polymers). 

The Examiner addresses CALR as if the general class in which it is included is not the C5a-like 
seven transmembrane receptor family, but rather all polynucleotides or all polypeptides, including the 
vast majority of useless theoretical molecules not occurring in nature, and thus not pre-selected by 
nature to be useful. While these "general classes" may contain a substantial number of useless 
members, the C5a-like seven transmembrane receptor family does not. The C5a-like seven 
transmembrane receptor family is sufficiently specific to rule out any reasonable possibility that CALR
would not also be useful like the other members of the family. 

Because the Examiner has not presented any evidence that the C5a-like class of seven 
transmembrane receptors has any, let alone a substantial number, of useless members, the Examiner 
must conclude that there is a "substantial likelihood" that the CALR encoded by the claimed 
polynucleotide is useful. It follows that the claimed polynucleotide also is useful. 

Even if the Examiner's "common utility" criterion were correct - and it is not - the C5a-like 
seven transmembrane receptor family would meet it. It is undisputed that known members of the C5a- 
like seven transmembrane receptor family are seven transmembrane receptors that bind complement 
and activate the immune function of mast cells. A person of ordinary skill in the art need not know any 
more about how the claimed invention binds complement and activates the immune function of mast 
cells to use it, and the Examiner presents no evidence to the contrary. Instead, the Examiner makes the 
conclusory observation that a person of ordinary skill in the art would need to know whether, for 
example, any given C5a-like seven transmembrane receptor binds complement and activates the
immune function of mast cells. The Examiner then goes on to assume that the only use for CALR 
absent knowledge as to how the C5a-like seven transmembrane receptor actually works is further 
study of CALR itself. 

Not so. As demonstrated by Appellants, knowledge that CALR is a C5a-like seven 
transmembrane receptor is more than sufficient to make it useful for the diagnosis and treatment of mast 
cell-associated immune conditions. Indeed, CALR has been shown to be expressed in mast cells. The 
Examiner must accept these facts to be true unless the Examiner can provide evidence or sound 
scientific reasoning to the contrary. But the Examiner has not done so. 

C. Because the uses of polynucleotides encoding CALR in drug discovery and 
disease diagnosis are practical uses beyond mere study of the invention itself, 
the claimed invention has substantial utility. 

The PTO's rejection is tantamount to a rejection on the ground that the polynucleotide is only a 
research tool and that the use of an invention as a tool for research is not a "substantial" use. Because 
the PTO's rejection assumes a substantial overstatement of the law, and is incorrect in fact, it must be 
overturned. 

There is no authority for the proposition that use as a tool for research is not a substantial utility. 
Indeed, the Patent Office has recognized that just because an invention is used in a research setting 
does not mean that it lacks utility (MPEP § 2107): 

Many research tools such as gas chromatographs, screening assays, and nucleotide 
sequencing techniques have a clear, specific and unquestionable utility (e.g., they are 
useful in analyzing compounds). An assessment that focuses on whether an invention is 
useful only in a research setting thus does not address whether the specific invention is 
in fact "useful" in a patent sense. Instead, Office personnel must distinguish between 
inventions that have a specifically identified utility and inventions whose specific utility 
requires further research to identify or reasonably confirm. 

The Patent Office's actual practice has been, at least until the present, consistent with that 
approach. It has routinely issued patents for inventions whose only use is to facilitate research, such as 
DNA ligases, which the PTO's Training Materials themselves acknowledge to be useful, and 
DNA sequences used, for example, as markers. 

Only a limited subset of research uses are not "substantial" utilities: those in which the only 
known use for the claimed invention is to be an object of further study, thus merely inviting further 
research. This follows from Brenner, in which the U.S. Supreme Court held that a process for making 
a compound does not confer a substantial benefit where the only known use of the compound was to 
be the object of further research to determine its use. Id. at 535. Similarly, in Kirk, the Court held that 
a compound would not confer substantial benefit on the public merely because it might be used to 
synthesize some other, unknown compound that would confer substantial benefit. Kirk, 376 F.2d at 
940, 945 ("What appellants are really saying to those in the art is take these steroids, experiment, and 
find what use they do have as medicines."). Nowhere do those cases state or imply, however, that a 
material cannot be patentable if it has some other beneficial use in research. 

As used in drug discovery and disease diagnosis, the claimed invention has a beneficial use in 
research other than studying the claimed invention or its protein products. It is a tool, rather than an 
object, of research. The data generated in gene expression monitoring using the claimed invention as a 
tool is not used merely to study the claimed polynucleotide itself, but rather to study properties of 
tissues, cells, and potential drug candidates and toxins. Without the claimed invention, the information 
regarding the properties of tissues, cells, drug candidates and toxins is less complete. 

The claimed invention has numerous additional uses as a research tool, each of which alone is a 
"substantial utility." These include uses in chromosomal mapping (Specification, page 9, line 23 through 
page 10, line 6). 

D. The Patent Examiner Failed to Demonstrate That a Person of Ordinary Skill in 
the Art Would Reasonably Doubt the Utility of the Claimed Invention 

1. Drug screening is a specific, substantial and credible utility 

The Examiner argues that "[e]ven if the expression of Applicant's individual protein is affected 
by a test compound in an array for drug screening, the specification does not disclose any specific and 
substantial interpretation for the result, and none is known in the art. Given this consideration, the 
individually claimed antibody 1 has no 'well-established' use." (Final Office Action, page 5.) 

Contrary to the Examiner's allegation, there is indeed a "specific and substantial" interpretation 
for the results of drug screening and toxicology testing using the claimed polynucleotide. Monitoring the 
expression of the claimed polynucleotide is a method of testing the toxicology of drug candidates during 
the drug development process. If the expression of a particular polynucleotide is affected in any way 
by exposure to a test compound, and if that particular polynucleotide (or its encoded polypeptide) is 
not the specific target of the test compound (e.g., if the test compound is a drug candidate), then the 
change in expression is an indication that the test compound has undesirable toxic side effects that may 
limit its usefulness as a specific drug. Learning this from an array in a gene expression monitoring 
experiment early in the drug development process costs less than learning this, for example, during 
Phase III clinical trials. It is important to note that such an indication of possible toxicity is specific not 
only for each compound tested, but also for each and every individual polynucleotide whose expression 
is being monitored. 

1 The Examiner in the quoted sentence refers to "the individually claimed antibody." Appellants 
note that the claims on appeal are directed to a polynucleotide and assume that the Examiner's 
reference to the "antibody" was made inadvertently. 

However, the Examiner continues to view the utility of the claimed polynucleotide in toxicology 
testing and drug screening as requiring knowledge of either the biological function or disease association 
of the polynucleotide. The Examiner views toxicology testing as a process to measure the toxicity of a 
drug candidate only when that drug candidate is specifically targeted to the claimed polynucleotide or its 
encoded polypeptide, alleging that "Applicant has failed to identify the consequences of identifying a 
compound which is toxic to a polypeptide encoded by the claimed polynucleotide." (Final Office 
Action, pages 3-4.) The Examiner has refused to consider that the claimed polynucleotide is useful for 
measuring the toxicity of drug candidates which are targeted not to the claimed polynucleotide or its 
encoded polypeptide, but to other polynucleotides or polypeptides. This utility of the claimed 
polynucleotide does not require any knowledge of the biological function or disease association of the 
claimed polynucleotide or its encoded polypeptide, and is a specific, substantial and credible utility. 
The Examiner provides neither evidence nor sound scientific reasoning, only unsupported personal 
opinion, to support the allegation that knowledge of "biological significance" or "disease association" is 
required for toxicology testing and drug screening. 

2. Irrelevance of disease association or differential expression to utility in 
toxicology testing 

The Examiner asserts that the specification does not disclose an association of the claimed 
polypeptides with "any disease or disorder," and therefore that "the artisan is required to perform 
substantial further experimentation on the claimed material itself in order to determine to what 'practical 
use' any expression information regarding this polynucleotide could be put." (Final Office Action, 
pages 4 and 5.) 
These are irrelevant. Appellants need not demonstrate whether the claimed polynucleotide is 
associated with disease. Appellants need only demonstrate that the claimed polynucleotide is useful. 

The claimed polynucleotide can be used for toxicology testing in drug discovery without any 
knowledge of disease association. Monitoring the expression of the claimed polynucleotide gives 
important information on the potential toxicity of a drug candidate that is specifically targeted to any 
other polynucleotide or its encoded polypeptide, regardless of the disease association of the claimed 
polynucleotide. The claimed polynucleotide is useful for measuring the toxicity of drug candidates 
specifically targeted to other polynucleotides or their encoded polypeptides, regardless of any possible 
utility for measuring the properties of the claimed polynucleotide. 

3. Discussion of toxicology testing in the Specification 

The Examiner alleges that "toxicology testing and drug discovery in the specification as 
originally filed" and that "the particulars of toxicology testing with SEQ ID NO:2 are not disclosed in 
the instant specification." (Final Office Action, page 3.) Well-established utilities, such as toxicology 
testing, need not be explicitly disclosed in a patent application. Furthermore, the Examiner's position 
amounts to nothing more than the Examiner's disagreement with Appellants' assertions about the 
knowledge of a person of ordinary skill. The Examiner must accept Appellants' assertions to be true. 
The Final Office Action fails to address the disclosure in the instant specification on gene and protein 
expression monitoring applications, as discussed below. 

Support for the utility of the claimed sequences in toxicology testing, as well as for utility in drug 
screening, may be found in the specification. For example, 

Because CALR is specifically expressed in cells active in immunity, the nucleic 
acid (calr), polypeptide (CALR) and antibodies to CALR are useful in investigations of 
and interventions in the normal and abnormal physiologic and pathologic processes 
which comprise the mast cell's role in immunity. Therefore, an assay for upregulated 
expression of CALR can accelerate diagnosis and proper treatment of conditions 
caused by abnormal signal transduction due to anaphylactic or hypersensitive 
responses, systemic and local infections, traumatic and other tissue damage, hereditary 
or environmental diseases associated with hypertension, carcinomas, and other 
physiologic or pathologic problems. 

The nucleotide sequence encoding CALR (or its complement) has numerous 
other applications in techniques known to those skilled in the art of molecular biology. 
These techniques include use as hybridization probes for Southerns or northerns, use as 
oligomers for PCR, use for chromosomal and gene mapping, use in the recombinant 
production of CALR, use in generation of anti-sense DNA or RNA, their chemical 
analogs and the like, and use in production of chimeric molecules for selecting agonists, 
inhibitors or antagonists for design of domain-specific therapeutic molecules. 
(Coleman '355 application, page 6, lines 15-29.) 

The Coleman '355 application further teaches that: 

The nucleotide sequence can be used to develop an assay to detect activation, 
inflammation, or disease associated with abnormal levels of CALR expression. The 
nucleotide sequence can be labeled by methods known in the art and added to a fluid 
or tissue sample from a patient. After an incubation period sufficient to effect 
hybridization, the sample is washed with a compatible fluid which contains a visible 
marker, a dye or other appropriate molecule(s), if the nucleotide has been labeled with 
an enzyme. After the compatible fluid is rinsed off, the dye is quantitated and compared 
with a standard. If the amount of dye is significantly elevated (or lowered, as the case 
may be), the nucleotide sequence has hybridized with the sample, and the assay 
indicates an abnormal condition such as inflammation or disease. (Coleman '355 
application at page 9, lines 13-22.) 

4. Utility of all expressed polypeptides in toxicology testing 

The Examiner asserts that use as a control for toxicology testing is not specific and substantial, 
and therefore not well-established, because it "would apply to virtually every member of a general class 
of materials, such as any collection of proteins or DNAs, but it is only potential with respect to SEQ ID 
NO:2." (Final Office Action, page 4). The Examiner does not point to any law, however, that says a 
utility that is shared by a large class is somehow not a utility. If all of the class of polypeptides or 
polynucleotides can be so used, then they all have utility. The issue is, once again, whether the claimed 
invention has any utility, not whether other compounds have a similar utility. Nothing in the law says 
that an invention must have a "unique" utility. Indeed, the whole notion of "well established" utilities 
presupposes that many different inventions can have the exact same utility. If the Examiner's argument 
were correct, there could never be a well established utility, because you could always find a generic 
group with the same utility! 

Furthermore, the Examiner is incorrect in stating that "virtually every member of a general class 
of materials, such as any collection of proteins or DNAs" could be used in toxicology testing. (Final 
Office Action, page 4.) The property of the claimed polynucleotide that makes it useful as a control for 
toxicology testing is its expression in naturally occurring cells. A polynucleotide having a random, non- 
naturally occurring sequence would most likely not be useful as a control for toxicology testing. 

The Examiner further asserts that "the information that is gained from the array is dependent on 
the pattern derived from the array, and says nothing with regard to each individual member of the array" 
and that this is, again, a general utility. (Final Office Action, pages 4-5.) Appellants note that while the 
information derived from an array does depend upon the pattern derived from individual members of 
the array, an array still cannot be made without individual members. Thus each individual 
polynucleotide sequence has a utility in creating arrays. Each of these individual polynucleotide 
sequences has a unique and specific utility in that it records the expression level of a unique gene. This 
is a substantial, "real world" utility in that one of ordinary skill in the art would know how to use the 
claimed polynucleotide's sequence in an array, without any further experimentation. 

IV. By Requiring the Patent Applicant to Assert a Particular or Unique Utility, the Patent 
Examination Utility Guidelines and Training Materials Applied by the Patent 
Examiner Misstate the Law 

There is an additional, independent reason to overturn the rejections: to the extent the rejections 
are based on Revised Interim Utility Examination Guidelines (64 FR 71427, December 21, 1999), the 
final Utility Examination Guidelines (66 FR 1092, January 5, 2001) and/or the Revised Interim Utility 
Guidelines Training Materials (USPTO Website www.uspto.gov, March 1, 2000), the Guidelines and 
Training Materials are themselves inconsistent with the law. 

The Training Materials, which direct the Examiners regarding how to apply the Utility 
Guidelines, address the issue of specificity with reference to two kinds of asserted utilities: "specific" 
utilities which meet the statutory requirements, and "general" utilities which do not. The Training 
Materials define a "specific utility" as follows: 

A [specific utility] is specific to the subject matter claimed. This contrasts to general 
utility that would be applicable to the broad class of invention. For example, a claim to 
a polynucleotide whose use is disclosed simply as "gene probe" or "chromosome 
marker" would not be considered to be specific in the absence of a disclosure of a 
specific DNA target. Similarly, a general statement of diagnostic utility, such as 
diagnosing an unspecified disease, would ordinarily be insufficient absent a disclosure of 
what condition can be diagnosed. 

The Training Materials distinguish between "specific" and "general" utilities by assessing 
whether the asserted utility is sufficiently "particular," i.e., unique (Training Materials at p. 52) as 
compared to the "broad class of invention." (In this regard, the Training Materials appear to parallel 
the view set forth in Stephen G. Kunin, Written Description Guidelines and Utility Guidelines, 82 
J.P.T.O.S. 77, 97 (Feb. 2000) ("With regard to the issue of specific utility the question to ask is 
whether or not a utility set forth in the specification is particular to the claimed invention.")). 

Such "unique" or "particular" utilities never have been required by the law. To meet the utility 
requirement, the invention need only be "practically useful," Natta, 480 F.2d at 1397, and confer a 
"specific benefit" on the public. Brenner, 383 U.S. at 534. Thus, incredible "throwaway" utilities, such 
as trying to "patent a transgenic mouse by saying it makes great snake food," do not meet this standard. 
Karen Hall, Genomic Warfare, The American Lawyer 68 (June 2000) (quoting John Doll, Chief of the 
Biotech Section of USPTO). 

This does not preclude, however, a general utility, contrary to the statement in the Training 
Materials where "specific utility" is defined (page 5). Practical real-world uses are not limited to uses 
that are unique to an invention. The law requires that the practical utility be "definite," not particular. 
Montedison, 664 F.2d at 375. Appellants are not aware of any court that has rejected an assertion of 
utility on the grounds that it is not "particular" or "unique" to the specific invention. Where courts have 
found utility to be too "general," it has been in those cases in which the asserted utility in the patent 
disclosure was not a practical use that conferred a specific benefit. That is, a person of ordinary skill in 
the art would have been left to guess as to how to benefit at all from the invention. In Kirk, for 
example, the CCPA held the assertion that a man-made steroid had "useful biological activity" was 
insufficient where there was no information in the specification as to how that biological activity could be 
practically used. Kirk, 376 F.2d at 941. 

The fact that an invention can have a particular use does not provide a basis for requiring a 
particular use. See Brana, supra (disclosure describing a claimed antitumor compound as being 
homologous to an antitumor compound having activity against a "particular" type of cancer was 
determined to satisfy the specificity requirement). "Particularity" is not and never has been the sine qua 
non of utility; it is, at most, one of many factors to be considered. 

As described supra, broad classes of inventions can satisfy the utility requirement so long as a 
person of ordinary skill in the art would understand how to achieve a practical benefit from knowledge 
of the class. Only classes that encompass a significant portion of nonuseful members would fail to meet 
the utility requirement. Supra § H.B. (Montedison, 664 F.2d at 374-75). 

The Training Materials fail to distinguish between broad classes that convey information of 
practical utility and those that do not, lumping all of them into the latter, unpatentable category of 
"general" utilities. As a result, the Training Materials paint with too broad a brush. Rigorously applied, 
they would render unpatentable whole categories of inventions that heretofore have been considered to 
be patentable and that have indisputably benefitted the public, including the claimed invention. See 
supra § H.B. Thus the Training Materials cannot be applied consistently with the law. 

V. To the Extent the Rejection of the Claimed Invention under 35 U.S.C. § 112, First 
Paragraph, Is Based on the Improper Rejection for Lack of Utility under 35 U.S.C. 
§ 101, it Must Be Reversed. 

The rejection set forth in the Office Action is based on the assertions discussed above, i.e., that 
the claimed invention lacks patentable utility. To the extent that the rejection under § 112, first 
paragraph, is based on the improper allegation of lack of patentable utility under § 101, it fails for the 
same reasons. 

(10) CONCLUSION 

Appellants respectfully submit that rejections for lack of utility based, inter alia, on an 
allegation of "lack of specificity," as set forth in the Office Action and as justified in the Revised Interim 
and final Utility Guidelines and Training Materials, are not supported in the law. Neither are they 
scientifically correct, nor supported by any evidence or sound scientific reasoning. These rejections are 
alleged to be founded on facts in court cases such as Brenner and Kirk, yet those facts are clearly 
distinguishable from the facts of the instant application, and indeed most if not all nucleotide and protein 
sequence applications. Nevertheless, the PTO is attempting to mold the facts and holdings of these 
prior cases, "like a nose of wax," 2 to target rejections of claims to polypeptides and polynucleotides 
where biological activity information has not been proven by laboratory experimentation, and they have 
done so by ignoring perfectly acceptable utilities fully disclosed in the specification as well as well- 
established utilities known to those of skill in the art. As is disclosed in the specification, and even more 
clearly, as one of ordinary skill in the art would understand, the claimed invention has well-established, 
specific, substantial and credible utilities. The rejections are, therefore, improper and should be 
reversed. 

Moreover, to the extent the above rejections were based on the Revised Interim and final 
Examination Guidelines and Training Materials, those portions of the Guidelines and Training Materials 
that form the basis for the rejections should be determined to be inconsistent with the law. 

Due to the urgency of this matter, including its economic and public health implications, an 
expedited review of this appeal is earnestly solicited. 

If the USPTO determines that any additional fees are due, the Commissioner is hereby 
authorized to charge Deposit Account No. 09-0108. 
This brief is enclosed in triplicate. 

Respectfully submitted, 
INCYTE GENOMICS, INC. 

Date: [illegible] 

Susan K. Sather 
Reg. No. 44,316 

Direct Dial Telephone: (650) 845-4646 

3160 Porter Drive 
Palo Alto, California 94304 
Phone: (650) 855-0555 
Fax: (650) 849-8886 



2 "The concept of patentable subject matter under §101 is not 'like a nose of wax which may be 
turned and twisted in any direction * * *.' White v. Dunbar, 119 U.S. 47, 51." (Parker v. Flook, 
198 USPQ 193 (US SupCt 1978)) 
APPENDIX - CLAIMS ON APPEAL 



12. (As Once Amended) An isolated polynucleotide comprising a polynucleotide sequence 
encoding the amino acid sequence of SEQ ID NO:2. 

13. (As Once Amended) An isolated polynucleotide comprising the polynucleotide 
sequence of SEQ ID NO:1. 

14. (As Once Amended) An isolated polynucleotide fully complementary to a 
polynucleotide comprising the polynucleotide sequence of SEQ ID NO:1. 

15. (As Once Amended) An expression vector comprising the isolated polynucleotide of 
claim 12. 

16. (Reiterated) A host cell comprising the expression vector of claim 15. 

17. (As Once Amended) A method for producing a polypeptide comprising the amino acid 
sequence of SEQ ID NO:2, said method comprising the steps of: 

(a) culturing the host cell of claim 16 under conditions suitable for expression of the 
polypeptide, and 

(b) recovering said polypeptide from the cell culture. 
ABSTRACT Pairwise sequence comparison methods have 
been assessed using proteins whose relationships are known 
reliably from their structures and functions, as described in 
the SCOP database [Murzin, A. G., Brenner, S. E., Hubbard, T. 
& Chothia, C. (1995) J. Mol. Biol. 247, 536-540]. The evaluation 
tested the programs BLAST [Altschul, S. F., Gish, W., 
Miller, W., Myers, E. W. & Lipman, D. J. (1990) J. Mol. Biol. 
215, 403-410], WU-BLAST2 [Altschul, S. F. & Gish, W. (1996) 
Methods Enzymol. 266, 460-480], FASTA [Pearson, W. R. & 
Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444-2448] 
and SSEARCH [Smith, T. F. & Waterman, M. S. (1981) J. Mol. 
Biol. 147, 195-197] and their scoring schemes. The error rate 
of all algorithms is greatly reduced by using statistical scores 
to evaluate matches rather than percentage identity or raw 
scores. The E-value statistical scores of SSEARCH and FASTA are 
reliable: the number of false positives found in our tests agrees 
well with the scores reported. However, the P-values reported 
by BLAST and WU-BLAST2 exaggerate significance by orders of 
magnitude. SSEARCH, FASTA ktup = 1, and WU-BLAST2 perform 
best, and they are capable of detecting almost all relationships 
between proteins whose sequence identities are >30%. For 
more distantly related proteins, they do much less well; only 
one-half of the relationships between proteins with 20-30% 
identity are found. Because many homologs have low sequence 
similarity, most distant relationships cannot be detected by 
any pairwise comparison method; however, those which are 
identified may be used with confidence. 

Sequence database searching plays a role in virtually every 
branch of molecular biology and is crucial for interpreting the 
sequences issuing forth from genome projects. Given the 
method's central role, it is surprising that overall and relative 
capabilities of different procedures are largely unknown. It is 
difficult to verify algorithms on sample data because this 
requires large data sets of proteins whose evolutionary rela- 
tionships are known unambiguously and independently of the 
methods being evaluated. However, nearly all known ho- 
mologs have been identified by sequence analysis (the method 
to be tested). Also, it is generally very difficult to know, in the 
absence of structural data, whether two proteins that lack clear 
sequence similarity are unrelated. This has meant that al- 
though previous evaluations have helped improve sequence 
comparison, they have suffered from insufficient, imperfectly 
characterized, or artificial test data. Assessment also has been 
problematic because high quality database sequence searching 
attempts to have both sensitivity (detection of homologs) and 
specificity (rejection of unrelated proteins); however, these 
complementary goals are linked such that increasing one 
causes the other to be reduced. 
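
The link between sensitivity and specificity described above can be illustrated with a minimal sketch. The scores and homology labels below are invented purely for illustration and are not data from this study; the function name and threshold values are likewise hypothetical.

```python
from typing import List, Tuple

# Hypothetical scored matches: (similarity score, True if the pair is truly homologous).
# These values are made up for illustration only.
MATCHES: List[Tuple[float, bool]] = [
    (95.0, True), (80.0, True), (62.0, False), (55.0, True),
    (41.0, False), (38.0, True), (30.0, False), (22.0, False),
]


def sensitivity_specificity(
    matches: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Treat every pair scoring >= threshold as a reported homolog.

    Sensitivity = fraction of true homologs that are reported.
    Specificity = fraction of unrelated pairs that are rejected.
    """
    tp = sum(1 for score, homolog in matches if score >= threshold and homolog)
    fn = sum(1 for score, homolog in matches if score < threshold and homolog)
    tn = sum(1 for score, homolog in matches if score < threshold and not homolog)
    fp = sum(1 for score, homolog in matches if score >= threshold and not homolog)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Lowering the threshold raises sensitivity and lowers specificity.
    for threshold in (90.0, 50.0, 25.0):
        sens, spec = sensitivity_specificity(MATCHES, threshold)
        print(f"threshold={threshold:5.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

On this toy data, a strict threshold reports few unrelated pairs but misses most homologs, while a permissive threshold finds every homolog at the cost of many spurious matches.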

The publication costs of this article were defrayed in part by page charge 
payment. This article must therefore be hereby marked "advertisement" in 
accordance with 18 U.S.C. §1734 solely to indicate this fact. 

© 1998 by The National Academy of Sciences 0027-8424/98/956073-6$2.00/0 
PNAS is available online at http://www.pnas.org. 



Sequence comparison methodologies have evolved rapidly, 
so no previously published test has evaluated modern versions 
of programs commonly used. For example, parameters in 
BLAST (1) have changed, and WU-BLAST2 (2), which produces 
gapped alignments, has become available. The latest version 
of FASTA (3) previously tested was 1.6, but the current release 
(version 3.0) provides fundamentally different results in the 
form of statistical scoring. 

The previous reports also have left gaps in our knowledge. 
For example, there has been no published assessment of 
thresholds for scoring schemes more sophisticated than per- 
centage identity. Thus, the widely discussed statistical scoring 
measures have never actually been evaluated on large data- 
bases of real proteins. Moreover, the different scoring schemes 
commonly in use have not been compared. 

Beyond these issues, there is a more fundamental question: 
in an absolute sense, how well does pairwise sequence com- 
parison work? That is, what fraction of homologous proteins 
can be detected using modern database searching methods? 

In this work, we attempt to answer these questions and to 
overcome both of the fundamental difficulties that have hin- 
dered assessment of sequence comparison methodologies. 
First, we use the set of distant evolutionary relationships in the 
SCOP: Structural Classification of Proteins database (4), which 
is derived from structural and functional characteristics (5). 
The SCOP database provides a uniquely reliable set of homologs, 
which are known independently of sequence comparison. 
Second, we use an assessment method that jointly measures 
both sensitivity and specificity. This method allows 
straightforward comparison of different sequence searching 
procedures. Further, it can be used to aid interpretation of real 
database searches and thus provide optimal and reliable 
results. 

Previous Assessments of Sequence Comparison. Several 
previous studies have examined the relative performance of 
different sequence comparison methods. The most encompassing 
analyses have been by Pearson (6, 7), who compared 
the three most commonly used programs. Of these, the Smith-Waterman 
algorithm (8) implemented in SSEARCH (3) is the 
oldest and slowest but the most rigorous. Modern heuristics 
have provided BLAST (1) the speed and convenience to make 
it the most popular program. Intermediate between these two 
is FASTA (3), which may be run in two modes offering either 
greater speed (ktup = 2) or greater effectiveness (ktup = 1). 
Pearson also considered different parameters for each of these 
programs. 

To test the methods, Pearson selected two representative 
proteins from each of 67 protein superfamilies defined by the 
PIR database (9). Each was used as a query to search the 
database, and the matched proteins were marked as being 
homologous or unrelated according to their membership of PIR 
superfamilies. Pearson found that modern matrices and "ln-scaling" 
of raw scores improve results considerably. He also 
reported that the rigorous Smith-Waterman algorithm worked 
slightly better than FASTA, which was in turn more effective 
than BLAST. 

Abbreviation: EPQ, errors per query. 

Very large scale analyses of matrices have been performed 
(10), and Henikoff and Henikoff (11) also evaluated the 
effectiveness of BLAST and FASTA. Their test with BLAST 
considered the ability to detect homologs above a predetermined 
score but had no penalty for methods which also 
reported large numbers of spurious matches. The Henikoffs 
searched the SWISS-PROT database (12) and used PROSITE (13) 
to define homologous families. Their results showed that the 
BLOSUM62 matrix (14) performed markedly better than the 
extrapolated PAM-series matrices (15), which previously had 
been popular. 

A crucial aspect of any assessment is the data that are used 
to test the ability of the program to find homologs. But in 
Pearson's and the Henikoffs' evaluations of sequence comparison, 
the correct results were effectively unknown. This is 
because the superfamilies in PIR and PROSITE are principally 
created by using the same sequence comparison methods 
which are being evaluated. Interdependency of data and 
methods creates a "chicken and egg" problem, and means, for 
example, that new methods would be penalized for correctly 
identifying homologs missed by older programs. For instance, 
immunoglobulin variable and constant domains are clearly 
homologous, but PIR places them in different superfamilies. 
The problem is widespread: each superfamily in PIR 48.00 with 
a structural homolog is itself homologous to an average of 1.6 
other PIR superfamilies (16). 

To surmount these sorts of difficulties, Sander and Schnei- 
der (17) used protein structures to evaluate sequence com- 
parison. Rather than comparing different sequence compari- 
son algorithms, their work focused on determining a length- 
dependent threshold of percentage identity, above which all 
proteins would be of similar structure. A result of this analysis 
was the hssp equation; it states that proteins with 25% identity 
over 80 residues will have similar structures, whereas shorter 
alignments require higher identity. (Other studies also have 
used structures (18-20), but these focused on a small number 
of model proteins and were principally oriented toward eval- 
uating alignment accuracy rather than homology detection.) 

A general solution to the problem of scoring comes from 
statistical measures (i.e., E-values and P-values) based on the 
extreme value distribution (21). Extreme value scoring was 
implemented analytically in the BLAST program using the 
Karlin and Altschul statistics (22, 23) and empirical ap- 
proaches have been recently added to Fasta and SSEARCH. In 
addition to being heralded as a reliable means of recognizing 
significantly similar proteins (24, 25), the mathematical trac- 
tability of statistical scores "is a crucial feature of the blast 
algorithm" (1). The validity of this scoring procedure has been 
tested analytically and empirically (see ref. 2 and references in 
ref. 24). However, all large empirical tests used random 
sequences that may lack the subtle structure found within 
biological sequences (26, 27) and obviously do not contain any 
real homologs. Thus, although many researchers have sug- 
gested that statistical scores be used to rank matches (24, 25, 
28), there have been no large rigorous experiments on biolog-
ical data to determine the degree to which such rankings are
superior. 

A Database for Testing Homology Detection. Since the 
discovery that the structures of hemoglobin and myoglobin are 
very similar though their sequences are not (29), it has been 
apparent that comparing structures is a more powerful (if less 
convenient) way to recognize distant evolutionary relation- 
ships than comparing sequences. If two proteins show a high 
degree of similarity in their structural details and function, it 
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is very probable that they have an evolutionary relationship 
though their sequence similarity may be low. 

The recent growth of protein structure information com- 
bined with the comprehensive evolutionary classification in 
the scop database (4, 5) have allowed us to overcome previous 
limitations. With these data, we can evaluate the performance 
of sequence comparison methods on real protein sequences 
whose relationships are known confidently. The scop database 
uses structural information to recognize distant homologs, the 
large majority of which can be determined unambiguously. 
These superfamilies, such as the globins or the immunoglobu-
lins, would be recognized as related by the vast majority of the
biological community despite the lack of high sequence sim- 
ilarity. 

From scop, we extracted the sequences of domains of 
proteins in the Protein Data Bank (pdb) (30) and created two 
databases. One (PDB90D-B) has domains that were all <90%
identical to any other, whereas the other (PDB40D-B) had those <40%
identical. The databases were created by first sorting all 
protein domains in scop by their quality and making a list. The 
highest quality domain was selected for inclusion in the 
database and removed from the list. Also removed from the list 
(and discarded) were all other domains above the threshold 
level of identity to the selected domain. This process was 
repeated until the list was empty. The PDB40D-B database 
contains 1,323 domains, which have 9,044 ordered pairs of 
distant relationships, or ~0.5% of the total 1,749,006 ordered
pairs. In PDB90D-B, the 2,079 domains have 53,988 relation-
ships, representing 1.2% of all pairs. Low complexity regions
of sequence can achieve spurious high scores, so these were 
masked in both databases by processing with the seg program 
(27) using recommended parameters: 12 1.8 2.0. The databases 
used in this paper are available from http://sss.stanford.edu/ 
sss/, and databases derived from the current version of SCOP 
may be found at http://scop.mrc-lmb.cam.ac.uk/scop/. 
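
The selection procedure described above (sort by quality, keep the best remaining domain, discard everything too similar to it, repeat) can be sketched in Python. This is only an illustration under stated assumptions: the quality scale, the `identity` helper, and the toy data are hypothetical and are not taken from the paper.

```python
def select_representatives(domains, identity, threshold):
    """Greedy filter: keep the best remaining domain, drop its near-duplicates.

    domains   -- list of (name, quality) tuples; higher quality is better (assumed scale)
    identity  -- function (name_a, name_b) -> percentage identity (hypothetical helper)
    threshold -- identity level at or above which a domain counts as redundant
    """
    # Sort once so the highest quality domain is always at the front of the list.
    remaining = sorted(domains, key=lambda d: d[1], reverse=True)
    selected = []
    while remaining:
        best, _ = remaining.pop(0)
        selected.append(best)
        # Discard every other domain that is too similar to the one just selected.
        remaining = [d for d in remaining if identity(best, d[0]) < threshold]
    return selected


# Toy data: "a" and "b" are near-identical, "c" is distant from both.
toy_identity = {
    frozenset(("a", "b")): 95.0,
    frozenset(("a", "c")): 30.0,
    frozenset(("b", "c")): 35.0,
}


def toy_lookup(x, y):
    return toy_identity[frozenset((x, y))]


kept = select_representatives([("a", 3), ("b", 2), ("c", 1)], toy_lookup, 40.0)
```

With a 40% threshold the toy run keeps `a` and `c` and drops `b`. A database of n domains then has n(n-1) ordered pairs, which for 1,323 domains gives the 1,749,006 quoted above.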

Analyses from both databases were generally consistent, but 
PDB40D-B focuses on distantly related proteins and reduces the 
heavy overrepresentation in the PDB of a small number of 
families (31, 32), whereas PDB90D-B (with more sequences) 
improves evaluations of statistics. Except where noted other-
wise, the distant homolog results here are from PDB40D-B.
Although the precise numbers reported here are specific to the 
structural domain databases used, we expect the trends to be 
general. 

Assessment Data and Procedure. Our assessment of se- 
quence comparison may be divided into four different major 
categories of tests. First, using just a single sequence compar- 
ison algorithm at a time, we evaluated the effectiveness of 
different scoring schemes. Second, we assessed the reliability 
of scoring procedures, including an evaluation of the validity 
of statistical scoring. Third, we compared sequence compari- 
son algorithms (using the optimal scoring scheme) to deter- 
mine their relative performance. Fourth, we examined the 
distribution of homologs and considered the power of pairwise 
sequence comparison to recognize them. All of the analyses 
used the databases of structurally identified homologs and a 
new assessment criterion. 

The analyses tested blast (1), version 1.4.9MP, and wu-
blast2 (2), version 2.0a13MP. Also assessed was the fasta
package, version 3.0t76 (3), which provided fasta and the
ssearch implementation of Smith-Waterman (8). For
ssearch and fasta, we used BLOSUM45 with gap penalties
-12/-1 (7, 16). The default parameters and matrix (BLO-
SUM62) were used for blast and wu-blast2.

The "Coverage Vs. Error" Plot. To test a particular protocol 
(comprising a program and scoring scheme), each sequence 
from the database was used as a query to search the database. 
This yielded ordered pairs of query and target sequences with 
associated scores, which were sorted, on the basis of their 
scores, from best to worst. The ideal method would have 
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perfect separation, with all of the homologs at the top of the 
list and unrelated proteins below. In practice, perfect separa- 
tion is impossible to achieve so instead one is interested in 
drawing a threshold above which there are the largest number 
of related pairs of sequences consistent with an acceptable 
error rate. 

Our procedure involved measuring the coverage and error 
for every threshold. Coverage was defined as the fraction of 
structurally determined homologs that have scores above the 
selected threshold; this reflects the sensitivity of a method.
Errors per query (EPQ), an indicator of selectivity, is the 
number of nonhomologous pairs above the threshold divided 
by the number of queries. Graphs of these data, called 
coverage vs. error plots, were devised to understand how
protocols compare at different levels of accuracy. These
graphs share effectively all of the beneficial features of Re-
ceiver Operating Characteristic (ROC) plots (33, 34) but
better represent the high degrees of accuracy required in 
sequence comparison and the huge background of nonho- 
mologs. 
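
The coverage and EPQ definitions above translate directly into a short calculation. The Python sketch below is illustrative only: the function names and the toy hit list are hypothetical, tied scores are not treated specially, and it assumes that a higher score means a better match (for E-values the sort order would be reversed).

```python
def coverage_vs_error(hits, n_homolog_pairs, n_queries):
    """Return (coverage, errors_per_query) after each hit, best score first.

    hits            -- list of (score, is_homolog); higher score = better match (assumption)
    n_homolog_pairs -- total number of structurally determined homologous pairs
    n_queries       -- number of query sequences searched
    """
    points = []
    true_found = 0
    false_found = 0
    for _, is_homolog in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_homolog:
            true_found += 1
        else:
            false_found += 1
        # Each hit defines a threshold; record coverage and EPQ at that threshold.
        points.append((true_found / n_homolog_pairs, false_found / n_queries))
    return points


def coverage_at_epq(points, max_epq):
    """Largest coverage reached while errors per query stay at or below max_epq."""
    allowed = [cov for cov, epq in points if epq <= max_epq]
    return max(allowed) if allowed else 0.0


# Toy search: 2 queries, 4 true homologous pairs, 5 reported hits.
toy_hits = [(50, True), (40, True), (30, False), (20, True), (10, False)]
toy_points = coverage_vs_error(toy_hits, n_homolog_pairs=4, n_queries=2)
```

Plotting the second element of each point against the first gives a coverage vs. error curve; `coverage_at_epq(toy_points, 0.01)` reads off the coverage at 1% EPQ.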

This assessment procedure is directly relevant to practical 
sequence database searching, for it provides precisely the 
information necessary to perform a reliable sequence database 
search. The EPQ measure places a premium on score consis- 
tency; that is, it requires scores to be comparable for different 
queries. Consistency is an aspect which has been largely 
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Fig. 2. Unrelated proteins with high percentage identity. Hemo- 
globin 0-chain (pdb code lhds chain b. ref. 38. Left) and cellulase E2 
PDB code Itml ref. 39, Right) have 39% identity over 64 residues, a 
level which is often believed to be indicative of homology. Despite this 
high degree of identity, their structures strongly suggest that these 
«o«^8s re n n0 ;K re l: alCd ( A Pf r °P" atc| y- "^ther the raw alignment 
^smol (40? significant. Proteins rendered by 
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Alignment length 

Fig. 3. Length and percentage identity of alignments of unrelated 
proteins in pdbwd-b: Each pair of nonhomologous proteins found with 
ssearch is plotted as a point whose position indicates the length and 
the percentage identity within the alignment. Because alignment 
length and percentage identity are quantired, many pairs of proteins 
may have exactly the same alignment length and percentage identity 
TTteUne shows the hssp threshold (though it is intended to be applied 
with a different matrix and parameters). 
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Fig. 4. Reliability of statistical scores in pdotod-b: Each line shows ' 
the relationship between reported statistical score and actual error 
rate for a different program. E- values are reported for ssearch and 
fast a, whereas P-values are shown for blast and wu-blash. If the 
scoring were perfect, then the number of errors per query and the 
E-values would be the same, as indicated by the upper bold tine 
(P-values should be the same as EPQ for small numbers, and diverges 
at higher values, as indicated by the lower bold line.) E-values from 
ssearch and fasta are shown to have good agreement with EPQ but 
underestimate the significance slightly, blast and wu-blash are 
overconfident, with the degree of exaggeration dependent upon the 
score. The results for PDB40D-B were similar to those for PDBWD-b 
despite the difference in number of homologs detected. This graph 
could be used to roughly calibrate the reliability of a given statistical 
score. 

ignored in previous tests but is essential for the straightforward 
or automatic interpretation of sequence comparison results. 
Further, it provides a clear indication of the confidence that 
should be ascribed to each match. Indeed, the EPQ measure 
should approximate the expectation value reported by data- 
base searching programs, if the programs' estimates are accu- 
rate. 

The Performance of Scoring Schemes. All of the programs 
tested could provide three fundamental types of scores. The 
first score is the percentage identity, which may be computed 
in several ways based on either the length of the alignment or 
the lengths of the sequences. The second is a "raw" or 
"Smith-Waterman" score, which is the measure optimized by 
the Smith-Waterman algorithm and is computed by summing
the substitution matrix scores for each position in the align-
ment and subtracting gap penalties. In blast, a measure
related to this score is scaled into bits. Third is a statistical
score based on the extreme value distribution. These results
are summarized in Fig. 1.

Sequence Identity. Though it has been long established that
percentage identity is a poor measure (35), there is a common
rule-of-thumb stating that 30% identity signifies homology. 
Moreover, publications have indicated that 25% identity can 
be used as a threshold (17, 36). We find that these thresholds, 
originally derived years ago, are not supported by present 
results. As databases have grown, so have the possibilities for 
chance alignments with high identity; thus, the reported cutoffs 
lead to frequent errors. Fig. 2 shows one of the many pairs of 
proteins with very different structures that nonetheless have 
high levels of identity over considerable aligned regions. 
Despite the high identity, the raw and the statistical scores for 
such incorrect matches are typically not significant. The prin- 
cipal reasons percentage identity does so poorly seem to be 
that it ignores information about gaps and about the conser- 
vative or radical nature of residue substitutions. 

From the PDB90D-B analysis in Fig. 3, we learn that 30%
identity is a reliable threshold for this database only for 
sequence alignments of at least 150 residues. Because one 
unrelated pair of proteins has 43.5% identity over 62 residues, 
it is probably necessary for alignments to be at least 70 residues 
in length before 40% is a reasonable threshold, for a database 
of this particular size and composition. 

At a given reliability, scores based on percentage identity 
detect just a fraction of the distant homologs found by 
statistical scoring. If one measures the percentage identity in 
the aligned regions without consideration of alignment length, 
then a negligible number of distant homologs are detected. 
Use of the hssp equation improves the value of percentage 
identity, but even this measure can find only 4% of all known 
homologs at 1% EPQ. In short, percentage identity discards 
most of the information measured in a sequence comparison. 

Raw Scores. Smith-Waterman raw scores perform better 
than percentage identity (Fig. 1), but ln-scaling (7) provided no
notable benefit in our analysis. It is necessary to be very precise 
when using either raw or bit scores because a 20% change in 
cutoff score could yield a tenfold difference in EPQ. However, 
it is difficult to choose appropriate thresholds because the 
reliability of a bit score depends on the lengths of the proteins 
matched and the size of the database. Raw score thresholds 
also are affected by matrix and gap parameters. 

Statistical Scores. Statistical scores were introduced partly 
to overcome the problems that arise from raw scores. This 
scoring scheme provides the best discrimination between 
homologous proteins and those which are unrelated. Most 
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likely, its power can be attributed to its incorporation of more 
information than any other measure; it takes account of the 
full substitution and gap data (like raw scores) but also has 
details about the sequence lengths and composition and is 
scaled appropriately. 

We find that statistical scores are not only powerful, but also 
easy to interpret. ssearch and fasta show close agreement
between statistical scores and actual number of errors per
query (Fig. 4). The expectation value score gives a good,
slightly conservative estimate of the chances of the two se-
quences being found at random in a given query. Thus, an
E-value of 0.01 indicates that roughly one pair of nonhomologs
of this similarity should be found in every 100 different queries. 
Neither raw scores nor percentage identity can be interpreted 
in this way, and these results validate the suitability of the 
extreme value distribution for describing the scores from a 
database search. 
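
The arithmetic behind this reading of an E-value can be checked with a few lines of Python. The conversion from E-value to P-value shown here, P = 1 - exp(-E), is the usual Poisson relation and is an assumption added for illustration; the text itself does not state it. The function names are hypothetical.

```python
import math


def expected_false_matches(e_value, n_queries):
    """Expected number of chance matches at this E-value over n_queries searches."""
    return e_value * n_queries


def p_value_from_e_value(e_value):
    """Probability of at least one chance match, assuming Poisson-distributed counts."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


one_in_hundred = expected_false_matches(0.01, 100)   # about one chance match
p_small = p_value_from_e_value(0.01)                 # close to the E-value itself
p_large = p_value_from_e_value(10.0)                 # saturates just below 1
```

For small values the P-value and E-value nearly coincide, while for large values the P-value levels off below 1, which is the divergence between the two bold lines described in the legend of Fig. 4.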

The P-values from blast also should be directly interpret- 
able but were found to overstate significance by more than two 
orders of magnitude for 1% EPQ for this database. Nonethe-
less, these results strongly suggest that the analytic theory is
fundamentally appropriate. wu-blast2 scores were more re-
liable than those from blast, but also exaggerate expected
confidence by more than an order of magnitude at 1% EPQ.

Overall Detection of Homologs and Comparison of Algo-
rithms. The results in Fig. 5A and Table 1 show that pairwise
sequence comparison is capable of identifying only a small
fraction of the homologous pairs of sequences in PDB40D-B.
Even ssearch with E-values, the best protocol tested, could
find only 18% of all relationships at 1% EPQ. blast, which
identifies 15%, was the worst performer, whereas fasta
ktup = 1 is nearly as effective as ssearch. fasta ktup = 2 and
wu-blast2 are intermediate in their ability to detect ho-
mologs. Comparison of different algorithms indicates that
those capable of identifying more homologs are generally
slower. ssearch is 25 times slower than blast and 6.5 times
slower than fasta ktup = 1. wu-blast2 is slightly faster than
fasta ktup = 2, but the latter has more interpretable scores. 

In PDB90D-B, where there are many close relationships, the 
best method can identify only 38% of structurally known 
homologs (Fig. 5B). The method which finds that many
relationships is wu-blast2. Consequently, we infer that the
differences between fasta ktup = 1, ssearch, and wu-blast2
programs are unlikely to be significant when compared with 
variation in database composition and scoring reliability. 

Fig. 6 helps to explain why most distant homologs cannot be 
found by sequence comparison: a great many such relation- 
ships have no more sequence identity than would be expected 
by chance. ssearch with E-values can recognize >90% of the
homologous pairs with 30-40% identity. In this region, there 
are 30 pairs of homologous proteins that do not have signif- 
icant E-values, but 26 of these involve sequences with <50 
residues. Of sequences having 25-30% identity, 75% are 
identified by ssearch E-values. However, although the num- 
ber of homologs grows at lower levels of identity, the detection 
falls off sharply: only 40% of homologs with 20-25% identity
are detected and only 10% of those with 15-20% can be found.
These results show that statistical scores can find related
proteins whose identity is remarkably low; however, the power
of the method is restricted by the great divergence of many
protein sequences.

Fig. 6. Distribution and detection of homologs in PDB40D-B. Bars
show the distribution of homologous pairs in PDB40D-B according to their
identity (using the measure of identity in both). Filled regions indicate
the number of these pairs found by the best database searching method
(ssearch with E-values) at 1% EPQ. The PDB40D-B database contains
proteins with <40% identity, and as shown on this graph, most
structurally identified homologs in the database have diverged ex-
tremely far in sequence and have <20% identity. Note that the
alignments may be inaccurate, especially at low levels of identity. Filled
regions show that ssearch can identify most relationships that have
25% or more identity, but its detection wanes sharply below 25%.
Consequently, the great sequence divergence of most structurally
identified evolutionary relationships effectively defeats the ability of
pairwise sequence comparison to detect them.

After completion of this work, a new version of pairwise 
blast was released: blastgp (37). It supports gapped align-
ments, like wu-blast2, and dispenses with sum statistics. Our
initial tests on blastgp using default parameters show that its
E-values are reliable and that its overall detection of homologs
was substantially better than that of ungapped blast, but not
quite equal to that of wu-blast2.

CONCLUSION 

The general consensus amongst experts (see refs. 7, 24, 25, 27
and references therein) suggests that the most effective se-
quence searches are made by (i) using a large current database
in which the protein sequences have been complexity masked
and (ii) using statistical scores to interpret the results. Our
experiments fully support this view. 

Our results also suggest two further points. First, the E-val- 
ues reported by fasta and ssearch give fairly accurate 
estimates of the significance of each match, but the P-values 
provided by blast and wu-blast2 underestimate the true
extent of errors. Second, ssearch, wu-blast2, and fasta
ktup = 1 perform best, though blast and fasta ktup = 2
detect most of the relationships found by the best procedures
and are appropriate for rapid initial searches.

Table 1. Summary of sequence comparison methods with PDB40D-B

Method                                  Relative time*   1% EPQ cutoff      Coverage at 1% EPQ
ssearch % identity: within alignment    25.5             >70%               <0.1
ssearch % identity: within both         25.5             34%                3.0
ssearch % identity: hssp-scaled         25.5             35% (hssp + 9.8)   4.0
ssearch Smith-Waterman raw scores       25.5             142                10.5
ssearch E-values                        25.5             0.03               18.4
fasta ktup = 1 E-values                 3.9              0.03               17.9
fasta ktup = 2 E-values                 1.4              0.03               16.7
wu-blast2 P-values                      1.1              0.003              17.5
blast P-values                          1.0              0.00016            14.8

*Times are from large database searches with genome proteins.

6078 Biochemistry: Brenner et al. Proc. Natl. Acad. Sci. USA 95 (1998)

The homologous proteins that are found by sequence com- 
parison can be distinguished with high reliability from the huge 
number of unrelated pairs. However, even the best database 
searching procedures tested fail to find the large majority of 
distant evolutionary relationships at an acceptable error rate. 
Thus, if the procedures assessed here fail to find a reliable 
match, it does not imply that the sequence is unique; rather, it 
indicates that any relatives it might have are distant ones.** 



Additional and updated information about this work, including 
supplementary figures, may be found at http://sss.stanford.edu/sss/.
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A cDNA clone, DTJP03, encoding an orphan receptor, was isolated from a canine thyroid library, and found to exhibit 
68.6% amino-acid identity with the recently described human C5a receptor. This relatively low similarity first suggested
that DTJP03 encoded either a C5a receptor subtype, or the presumably related C3a receptor. Binding studies performed
on membranes from COS-7 cells expressing the recombinant receptor demonstrated that DTJP03 encoded a high-affinity
C5a receptor, with a Kd of 1.2 nM. C3a was unable to compete for C5a binding. Intracellular free calcium concentrations
were measured by Quin-2 fluorescence assays in Chinese hamster ovary cells stably transfected with the canine C5a
receptor. C5a addition elicited an increase in the intracellular calcium concentration. Extracellular EGTA partially
prevented this response, suggesting that activation of the C5a receptor promotes both the release of calcium from
intracellular stores, and the influx of extracellular calcium. Genes encoding C5a-receptor subtypes were subsequently
searched for by PCR in genomic DNA from human, canine, rat and bovine sources. The result was the amplification of
a single gene fragment from each species, with about 70% identity between any two of them. The canine C5a receptor
has therefore to be considered as orthologous to the human C5a receptor described previously. The low similarity
between C5a receptors from different mammalian species is quite unusual for a G-protein-coupled receptor.



INTRODUCTION 

In the course of complement activation, specific cleavage of 
the C5 component releases the anaphylatoxin C5a, a 74 amino-
acid peptide (Fernandez & Hugli, 1978; Hugli, 1981). Binding of
C5a to its specific membrane receptor induces physiological
responses in a variety of cell types. In vivo, C5a is a potent
mediator of the acute inflammatory response. In vitro, C5a exerts
chemotaxis on macrophages and polymorphonuclear leucocytes
and is a powerful stimulator of neutrophil function. It induces
exocytosis of lysosomal hydrolytic enzymes, enhances production
of superoxide radicals, promotes neutrophil leukotriene B4
synthesis, as well as aggregation, adherence and neutrophil
margination. In addition, C5a has spasmogenic effects,
stimulating smooth muscle contraction, increasing vascular per-
meability and promoting mast-cell degranulation and histamine
release. C5a also induces serotonin release from platelets,
enhances interleukin-1 (IL-1) secretion from macrophages and
up-regulates surface expression of the complement receptors
CR1 and CR3 (for review see Fearon & Wong, 1983; Goldstein,
1988; Frank & Fries, 1991).

It was known that the C5a receptor belonged to the G-protein-
coupled family, but different G-proteins and intracellular path-
ways seem to be involved in the signal transduction. Most of the
effects induced by C5a are mediated through coupling of the C5a
receptor to pertussis toxin (PT)-sensitive G-protein(s) (Warner et
al., 1987; Nourshargh & Williams, 1990; Rollins et al., 1991).
Some effects are, however, insensitive to PT treatment (Monk &
Banks, 1991a,b). The requirement for extracellular calcium also
depends on the assay system (Zimmerli et al., 1990; Dore et al.,
1990; Kernen et al., 1991).

Using degenerate primers corresponding to the conserved
transmembrane segments of the G-protein-coupled-receptor
superfamily, we have isolated by PCR (Saiki et al., 1988) a series
of orphan receptors from either cDNA or genomic DNA (Libert
et al., 1989; Parmentier et al., 1989). Some of these orphan
receptors have since been identified (Maenhaut et al., 1990;
Libert et al., 1991; Parmentier et al., 1992).

Recently, the cloning of the human C5a receptor was reported
(Gerard & Gerard, 1991; Boulay et al., 1991) and sequence
comparison revealed a 68.6% identity with one of our orphan
receptors, DTJP03. In this paper we present results confirming
that DTJP03 encodes the canine C5a receptor. In the search of
potential subtypes, partial clones encoding the canine, human,
rat and bovine C5a receptors were amplified by PCR. A single
type was obtained for each species, sharing about 70% identity
with one another. This represents a surprisingly high interspecies
variability as compared with other G-protein-coupled receptors.

MATERIALS AND METHODS 
Cloning and sequencing 

PCR was performed on 0.15 µg of purified cDNA as described
previously (Libert et al., 1989; Parmentier et al., 1989), and the
amplification product was cloned in M13 vectors for sequencing.
JP03, a 600 bp fragment encoding part of an orphan receptor,
was used to screen a canine thyroid λgt11 cDNA library (Lefort
et al., 1989). The two positive clones were purified to homogeneity
and the EcoRI cDNA inserts were subcloned in pBluescript
SK+ plasmid vector (Stratagene). After subcloning of over-
lapping restriction fragments in M13mp18 and 19, the clones
were sequenced on both strands by the dideoxynucleotide-chain-
termination method (Sanger et al., 1977), using an automated
DNA sequencer (Applied Biosystems 370A).

The coding region was cloned as a 1350 bp EcoRI-PstI
fragment in pBluescript SK+ and further subcloned as a
BamHI-XhoI fragment in the eukaryotic expression vector pSVL
(Pharmacia).

Abbreviations used: IL-1, interleukin 1; IL-8, interleukin 8; PT, pertussis toxin; CHO, Chinese hamster ovary; fMLP, formyl-Met-Leu-Phe; ICL

Fig. 1. Nucleotide and deduced amino-acid sequences of the canine C5a receptor cDNA

Numbering is relative to the putative initiation codon which presents a satisfactory sequence context according to Kozak (1989). The putative
transmembrane segments are indicated by roman numerals I to VII. • indicates every tenth amino acid.
[Sequence listing illegible in the scanned original.]

Sequence handling and data analysis were carried out using 
DNASIS/PROSIS software (Hitachi), LWL85 software (Li &
Luo, 1985) and the GCG/VMS software package (Genetics
Computer Group, WI, U.S.A.).

Transfections and cell cultures 

The pSVL/DTJP03 construct was transfected in COS-7 cells
as described previously (Gerard et al., 1991). Chinese hamster
ovary (CHO) CHO-K1 cells were co-transfected with
pSVL/DTJP03 and pSV2Neo as described (Perret et al., 1990).
After 10 days of G418 selection, resistant cells were pooled and
stored in liquid nitrogen until needed. COS-7 and CHO-K1 cells
were cultured using Dulbecco's modified Eagle's medium and
Ham's F12 medium respectively, as described previously (Perret
et al., 1990). Clonal CHO cell lines were produced by low-
density-culture seeding and harvesting of individual clones using
sterile rings.

Membrane preparations 

COS-7 cell membranes were prepared 72 h after transfection. 
CHO-K1 cells were prepared using nearly confluent cultures.
Membrane preparation was as previously described (Perret et al.,
1990). Protein content was determined using the Bradford assay
(Bradford, 1976). 

Ligands 

Human C5a, human C3a, 125I-labelled human C5a and 125I-
labelled human C3a were kindly provided by Dr. Bitter-
Suermann and coworkers (Institut für Medizinische Mikro-
biologie, Medizinische Hochschule, Hannover, Germany).

Table 1. Nucleotide and amino-acid identity scores and synonymous (ks) and non-synonymous (ka) evolutionary rates for the C5a receptor and several other
members of the G-protein-coupled-receptor family

ka and ks values calculated using LWL85 software (Li & Luo, 1985). Bovine and rat C5a receptor sequences are partial sequences extending from
transmembrane segments II to V and II to VII respectively.

                               Evolutionary rates      Identity (%)
Receptor          Species      ks/year    ka/year      Nucleotide   Amino acid
Complement C5a    HUM-DOG      2.7        1.3          75.2         68.6
                  HUM-RAT      6.1        1.3          73.6         68.1
                  HUM-BOV      3.0        0.9          80.5         76.4
                  DOG-RAT      6.1        1.6          72.4         64.3
                  DOG-BOV      2.6        1.3          78.9         68.2
                  RAT-BOV      5.4        1.3          73.2         67.6
Tachykinin NK2    HUM-RAT      3.7        0.5          86.5         88.4
Dopamine D2       HUM-MUS      2.8        0.1          90.7         96.0
β2 Adrenergic     HUM-MUS      3.8        0.4          83.0         85.9
                  HUM-RAT      3.3        0.4          81.0         86.4
TSH               HUM-RAT      3.9        0.5          84.4         85.7
                  DOG-RAT      4.6        0.4          85.1         89.0
                  DOG-HUM      2.5        0.4          89.8         89.7
Muscarinic M1     HUM-PIG      2.4        0.0          92.4         99.1
                  HUM-RAT      2.6        0.0          91.4         98.7
                  HUM-MUS      2.9        0.1          90.2         98.0
                  MUS-PIG      4.0        0.1          88.0         97.2
                  PIG-RAT      3.4        0.1          89.4         97.8

Binding assays 

All assays were carried out in 5-ml polypropylene tubes in a
final volume of 100 µl, and incubated for 1 h with constant
shaking at room temperature. The assay was initiated by
the addition of membranes (75-100 µg of protein) to the
tubes containing 20 mM-Hepes (pH 7.4)/125 mM-NaCl/5 mM-
KCl/0.5 mM-glucose/0.25% (w/v) BSA/1 mM-CaCl2/1 mM-
MgCl2 and the ligands. Concentrations of labelled ligands in
displacement experiments were 1-5 nM 125I-labelled human C5a
and 10 nM 125I-labelled human C3a. Non-specific binding was
determined by adding an excess of either human C5a (1 µM) or
human C3a (10 µM). The assay was terminated by centrifuging
(13000 g, 0 °C, 10 min) the membranes through a 10% (w/v)
sucrose cushion in phosphate-buffered saline. The tubes were 
then frozen in liquid nitrogen for 2-3 min, and the bottom of 
each tube containing the membrane pellet was cut with a blade 
and counted in a gamma counter. 

Calcium assay 

CHO cell cultures, preparation for assay, and assay conditions
were carried out as described previously (Van Sande et al., 1990).
Human C5a was used at a concentration of 100-150 nM. To
assay the contribution of extracellular calcium, EGTA (1.5 mM)
was added to the assay buffer in some experiments.

Genomic PCR 

Aliquots (1 µg) of canine, human, rat and bovine genomic
DNA were used as target DNA in PCR reactions (30 cycles of
1 min at 93 °C, 2 min at 55 °C, 3 min at 72 °C). Other annealing
temperatures were tested: 52 °C, 50 °C, 48 °C and 45 °C. All
other conditions were as for PCR on the canine thyroid cDNA
library described above. Primers containing XbaI or HindIII
restriction sites for cloning were as follows:



P1 = TAGATCTAGATCAA(T/C)GC(G/C)AT(C/A/T)TGGTT(T/C)CT;

P2 = ACTTAAGCTT(T/G)ATGCAGCA(G/A)TT(C/A/T)AT(G/A)TA;

P3 = TAGATCTAGACTGTTTTCGTCCATCGTCCA;

P4 = ACTTAAGCTTACCAC(G/C)ACCTT(C/T)AGTGT(C/T)TT.
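
The primer notation above can be unpacked mechanically. The following is a minimal sketch, not part of the published methods: it counts how many distinct oligonucleotides each degenerate primer represents and checks which cloning site it carries. The function names are hypothetical; the recognition sequences TCTAGA (XbaI) and AAGCTT (HindIII) are the standard ones.

```python
import re
from itertools import product

# Primer sequences as listed above; "(T/C)" marks a degenerate position.
PRIMERS = {
    "P1": "TAGATCTAGATCAA(T/C)GC(G/C)AT(C/A/T)TGGTT(T/C)CT",
    "P2": "ACTTAAGCTT(T/G)ATGCAGCA(G/A)TT(C/A/T)AT(G/A)TA",
    "P3": "TAGATCTAGACTGTTTTCGTCCATCGTCCA",
    "P4": "ACTTAAGCTTACCAC(G/C)ACCTT(C/T)AGTGT(C/T)TT",
}

# Standard recognition sequences of the two cloning enzymes.
SITES = {"XbaI": "TCTAGA", "HindIII": "AAGCTT"}


def parse(primer):
    """Split a primer into positions; each position is a tuple of allowed bases."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer)
    return [tuple(alt.split("/")) if alt else (base,) for alt, base in tokens]


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for choices in parse(primer):
        count *= len(choices)
    return count


def expand(primer):
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*parse(primer))]


def cloning_sites(primer):
    """Names of the restriction sites found in the non-degenerate part."""
    fixed = re.sub(r"\([^)]*\)", "N", primer)  # mask degenerate positions
    return [name for name, site in SITES.items() if site in fixed]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, degeneracy(seq), cloning_sites(seq))
```

Under this reading, P1 and P3 carry the XbaI site and P2 and P4 the HindIII site, which matches their pairing as opposing primers in the combinations described in the Results.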

RESULTS AND DISCUSSION 

Numerous discrete bands were obtained in PCR reactions,
using cDNA from a canine thyroid λgt11 library (Lefort et al.,
1989) as target DNA, and degenerate primers corresponding to
the conserved regions of the second, third, and seventh
transmembrane segments of known G-protein-coupled receptors,
as described previously (Libert et al., 1989; Parmentier et al.,
1989). These bands were cloned into the M13 vectors and
sequenced. Open reading frames presenting similarities with
G-protein-coupled receptors were searched for in all frames. Three
clones encoding new putative members of the receptor
superfamily were used as probes to screen the canine thyroid
cDNA library. DTJP03, a 600 bp PCR clone, produced two
positive signals out of 10* clones screened. The larger clone
(1993 bp) entirely contained the smaller one (1.5 kb). Sequencing
revealed a 1056 bp open reading frame (Fig. 1), having two
potential AUG initiation codons (bases 1 and 10). The region
surrounding the first AUG is closer to the consensus for initiation
sites, as described by Kozak (1989). The coding sequence of
DTJP03 was cloned as a 1350 bp insert into pBluescript SK+
and in the pSVL eukaryotic expression vector.

The cDNA encodes a 352 amino-acid protein (Fig. 1) with a



Fig. 2. Dendrogram showing the relative sequence similarities between the
C5a receptor from four mammalian species, and the related
receptors from the G-protein-coupled family

The dendrogram was generated by using the GCG Pileup software.
Abbreviations: HumC5a, human C5a; BovC5a, bovine C5a;
RatC5a, rat C5a; DogC5a, dog C5a; Humfmlp, human fMLP
receptor; Humil8hi, human IL-8 high-affinity receptor; Humil8lo,
human IL-8 low-affinity receptor; Rdc1, our orphan receptor (Libert
et al., 1989).



calculated relative molecular mass of 39186. The hydropathy
profile (Kyte & Doolittle, 1982) of the deduced amino-acid
sequence is consistent with the presence of seven transmembrane
domains (results not shown). Sequence comparison with the
recently cloned human C5a receptor (Gerard & Gerard, 1991;
Boulay et al., 1991) revealed a 68% amino-acid identity. This
percentage is well below the identity scores obtained between
mammalian orthologues for other G-protein-coupled receptors,
which generally ranged between 85 and 98%. A few examples
are given for comparison in Table 1. From the dendrogram
displayed in Fig. 2, the similarity is of the same order as that
observed for the two recently cloned high- and low-affinity
interleukin-8 (IL-8) receptors (Holmes et al., 1991; Murphy &
Tiffany, 1991). Given this moderate similarity, our first
hypothesis was that DTJP03 encoded a receptor closely related
to, but different from, the published human C5a receptor. We
considered that it could encode either a C5a subtype, or the
presumably related C3a receptor as suggested by the similar
structure of the two ligands (Greer, 1986).

In order to assay the binding characteristics of DTJP03,
membranes were prepared from transfected COS-7 cells
transiently expressing the recombinant receptor. Transfected COS
membranes exhibited a high binding capacity for 125I-labelled
human C5a (8000 c.p.m.). Over 80% of the total binding was
specifically displaced by a 10^3-fold excess of unlabelled C5a, a
10^4-fold excess of unlabelled C3a being without effect (Fig. 3a).

Fig. 3. Binding studies on the canine C5a receptor expressed in COS-7 and
CHO cell lines

(a) Binding of 125I-labelled human C5a (huC5a) and 125I-labelled
human C3a (huC3a) to COS-7 cells transiently expressing DTJP03,
and to control COS-7 cells. Bound radiolabelled ligand (open bars)
was displaced by an excess of either unlabelled huC5a (1 µM) (■) or
huC3a (10 µM) (E3). (b) A similar experiment was performed on
CHO cells stably transfected with DTJP03, as compared with CHO
cells transfected with the pSV2Neo plasmid alone (JP02 control). (c)
Saturation binding experiment performed on COS-7 cells transiently
expressing the canine C5a receptor. Total (T), specific (#) and
non-specific binding (A) are represented. The non-specific binding was
determined in the presence of 1 µM unlabelled huC5a. Curve fitting
using a non-linear regression algorithm yielded an apparent Kd of
1.2 nM.



No specific binding of 125I-labelled human C3a was obtained
with transfected cells. Control COS cell membranes did not
display significant binding for either C5a (700 c.p.m.) or C3a.
Saturation curves obtained with transfected COS cell membranes
and increasing concentrations of 125I-labelled human C5a
demonstrated a saturable high-affinity binding site with an
apparent Kd of 1.2 nM (Fig. 3c), this being within the range

Fig. 4. Intracellular calcium measurements in stably transfected CHO cells
and JP02 control CHO cells

Experiments were performed using a Quin-2 fluorescence assay and
a SPEX fluorimeter. (a) Panels A, B and C represent respectively the
responses elicited by the successive addition of human C5a (huC5a)
(100 nM) and ATP (10 µM) on a CHO cell line transfected with
pSV2Neo (JP02 control), and two clonal cell lines (nos. 11 and 14)
transfected with DTJP03, and expressing the C5a receptor with a
different efficiency. (b) Effect of the extracellular calcium
concentration on the response to C5a in clone no. 14. The dotted line
represents the response to huC5a (100 nM) in control medium
containing 1 mM-Ca2+. The continuous line represents the response
to huC5a after addition of an extracellular excess of EGTA (1.5 mM).



previously described for the human neutrophil and macrophage
C5a receptors (Chenoweth et al., 1982; Huey & Hugli, 1985;
Yancey et al., 1989).
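
To put the apparent Kd of 1.2 nM in context, the sketch below evaluates the standard one-site binding isotherm, B/Bmax = [L] / (Kd + [L]), at the labelled C5a concentrations quoted in the binding assay section (1-5 nM). It assumes simple one-site binding at equilibrium; the text states only that a non-linear regression was used, so the model and the function name are illustrative assumptions.

```python
def fractional_occupancy(ligand_nM, kd_nM):
    """One-site binding isotherm: fraction of receptors occupied at equilibrium."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (kd_nM + ligand_nM)


KD_NM = 1.2  # apparent Kd reported for the canine receptor (Fig. 3c)

if __name__ == "__main__":
    # 1 and 5 nM bracket the labelled C5a range used in displacement assays;
    # 1.2 nM is included because occupancy is one half when [L] equals Kd.
    for conc in (1.0, 1.2, 5.0):
        occ = fractional_occupancy(conc, KD_NM)
        print(f"[C5a] = {conc:.1f} nM -> occupancy = {occ:.2f}")
```

Under these assumptions, roughly 45% to 80% of the sites would be occupied across the 1-5 nM range.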

CHO-K1 cells were co-transfected by the pSVL/DTJP03
construct and the pSV2Neo vector conferring resistance to
neomycin. After selection by G418, the pool of resistant clones
was used to prepare membranes. Specific binding of 125I-labelled
human C5a to stably transfected CHO cell-membrane
preparations yielded results similar to those obtained with COS-7
cells (Fig. 3b), while C3a was unable to bind to the receptor.
The pool of neomycin-resistant CHO cells was cloned by
high-dilution-culture seeding followed by subsequent recovery of the
isolated colonies. The individual clones were screened by
measuring intracellular calcium concentrations by the Quin-2
fluorescence assay (Van Sande et al., 1990). Only 25% (6/24) of
the clones responded to 150 nM-human C5a in the Ca2+ assay.
One clone (No. 14) presented a large response to C5a, as
compared with that elicited by 10 µM-ATP, used as positive
control (Fig. 4a). This clone was used for further studies.
Addition of 1.5 mM-EGTA to the extracellular assay medium
modified the calcium fluorescence signal induced by subsequent
addition of the agonist (Fig. 4b). In the presence of extracellular



calcium, the intracellular calcium concentration increased rapidly
to its maximal level within 1 min, then decreased gradually back
to the basal level or slightly higher. However, when extracellular
calcium was chelated by excess EGTA, human C5a evoked a
sharp transient increase of intracellular calcium, that rapidly fell
off to beneath the basal level. As the initial increase in calcium
concentration is resistant to extracellular calcium depletion by
EGTA, it is likely that calcium is released from intracellular
stores. On the other hand, the subsequent decrease in calcium
concentration is faster in the absence of extracellular calcium,
suggesting that calcium influx contributes to the sustained phase.
This behaviour is typical of receptors coupled to a G-protein
activating the inositol phosphate cascade (Kojima, 1990; Dore et
al., 1990).

These results clearly demonstrate that DTJP03 is a canine
high-affinity C5a receptor that functionally couples to the
InsP3-calcium cascade in stably transfected CHO cell lines. Although
there is no pharmacological evidence for C5a receptor subtypes,
the low degree of similarity between the canine and the human
receptors could be indicative of a possible molecular
heterogeneity of the C5a receptor. We therefore searched for
orthologues of our canine C5a and of the human C5a receptors
in four mammalian species: human, dog, rat and cow. A PCR-based
approach was used, in which four nucleotide sequences,
conserved between the cloned canine and human C5a receptors,
were used as primers to amplify related gene fragments from
genomic DNA. Like most of the other members of the seven
transmembrane G-protein-coupled-receptor superfamily, the
C5a receptor lacks introns in its coding sequence. Four
moderately degenerate primers corresponding to parts of
transmembrane segments II (P1 and P2), VI (P3) and VII (P4)
were defined (see the Materials and methods section).
Independent PCR reactions were performed, using all four primer
combinations (i.e. P1 versus P2; P1 versus P4; P3 versus P2 and
P3 versus P4). To allow primer hybridization in the presence of
potential mismatches, several annealing temperatures (55 °C,
52 °C, 50 °C, 48 °C, and 45 °C) were used. Down to 45 °C, only
one band was visible for each primer combination at the expected
size (results not shown). These bands were cloned into the
bacteriophage M13mp18 and 19 vectors and sequences were
obtained from five or six clones under each condition. All
sequences obtained from canine DNA were identical to DTJP03;
likewise all sequences from human genomic DNA were identical
to the published human C5a receptor sequence. The bovine and
rat sequences were unique as well, regardless of PCR stringency
or primer combination. The partial amino-acid sequences
obtained from rat and bovine sequences were aligned with the
canine and human sequences (Fig. 5). Similarity between any two
sequences was close to 70%. These results demonstrate that our
canine receptor and the reported human C5a receptors effectively
represent orthologues.

There remains then the question of the surprisingly low
interspecies amino-acid conservation (68.9 ± 3.7%), which is in
contrast to the high conservation between other G-protein-coupled
receptors, such as the cannabinoid, the thyroid stimulating
hormone or the muscarinic receptors. This contrast
appears clearly when the evolutionary rates for synonymous
changes (Ks) and non-synonymous changes (Ka) (Li & Luo, 1985)
are calculated for the C5a receptors and several other members
of the G-protein-coupled-receptor family, as well as the
nucleotide and amino-acid identities (Table 1). As expected, the Ks
values, reflecting the rate of nucleotide substitutions that do not
affect amino acids, are relatively constant for all receptors, as they
reflect solely the evolutionary distance between species. On the
other hand, the Ka values, reflecting the nucleotide substitution
rate affecting the amino-acid sequence, are low for most

Fig. 5. Amino-acid alignment of the canine and human C5a receptors, the partial rat and bovine sequences obtained from genomic DNA by PCR, and the
sequences from the related chemoattractant receptors for human IL-8 and human fMLP

Amino acids identical to the canine C5a receptor are represented in bold characters. Numbering is relative to the canine C5a receptor amino-acid
sequence. The transmembrane segments are overlined and numbered I to VII. The putative sites for N-linked glycosylation in the N-terminal and
extracellular domains are underlined twice; the putative sites for phosphorylation by protein kinases A and C in the second and third ICLs are
underlined once. The alignment was performed by using the GCG Pileup software. See legend to Fig. 2 for key to abbreviations.



G-protein-coupled receptors, and significantly higher for the C5a
receptor. A corollary of these high Ka values is the lower identity
at the amino-acid level as compared with the nucleotide level, in
contrast to the increase observed for the other members of the
G-protein-coupled family. This signifies that evolutionary
constraints affect quite differently classes of receptors that are
believed to share a common transmembrane organization and
structure-function relationships. It is unclear to us why the C5a
receptor appears to deviate from the strong conservation
prevailing in this gene family.
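
The comparison drawn from Table 1 can be checked with a few lines of arithmetic. The sketch below is an illustration only: the values are transcribed from Table 1, the grouping of rows by receptor follows the table as printed here, and the use of a population standard deviation is an assumption (it is the variant that reproduces the quoted 68.9 ± 3.7%).

```python
from statistics import mean, pstdev

# (receptor, species pair, Ks, Ka, nucleotide identity %, amino-acid identity %)
# Values transcribed from Table 1.
TABLE_1 = [
    ("C5a", "HUM/DOG", 2.7, 1.3, 75.2, 68.6),
    ("C5a", "HUM/RAT", 6.1, 1.3, 73.6, 68.1),
    ("C5a", "HUM/BOV", 3.0, 0.9, 80.5, 76.4),
    ("C5a", "DOG/RAT", 6.1, 1.6, 72.4, 64.3),
    ("C5a", "DOG/BOV", 2.6, 1.3, 78.9, 68.2),
    ("C5a", "RAT/BOV", 5.4, 1.3, 73.2, 67.6),
    ("NK2", "HUM/RAT", 3.7, 0.5, 86.5, 88.4),
    ("D2", "HUM/MUS", 2.8, 0.1, 90.7, 96.0),
    ("B2", "HUM/MUS", 3.8, 0.4, 83.0, 85.9),
    ("B2", "HUM/RAT", 3.3, 0.4, 81.0, 86.4),
    ("TSH", "HUM/RAT", 3.9, 0.5, 84.4, 85.7),
    ("TSH", "DOG/RAT", 4.6, 0.4, 85.1, 89.0),
    ("TSH", "DOG/HUM", 2.5, 0.4, 89.8, 89.7),
    ("M1", "HUM/PIG", 2.4, 0.0, 92.4, 99.1),
    ("M1", "HUM/RAT", 2.6, 0.0, 91.4, 98.7),
    ("M1", "HUM/MUS", 2.9, 0.1, 90.2, 98.0),
    ("M1", "MUS/PIG", 4.0, 0.1, 88.0, 97.2),
    ("M1", "PIG/RAT", 3.4, 0.1, 89.4, 97.8),
]

c5a = [row for row in TABLE_1 if row[0] == "C5a"]
others = [row for row in TABLE_1 if row[0] != "C5a"]

# Mean and population SD of the six pairwise C5a amino-acid identities.
c5a_aa = [row[5] for row in c5a]
c5a_aa_mean = mean(c5a_aa)
c5a_aa_sd = pstdev(c5a_aa)

# Mean synonymous (Ks) and non-synonymous (Ka) rates per group.
ks_c5a, ks_others = mean(r[2] for r in c5a), mean(r[2] for r in others)
ka_c5a, ka_others = mean(r[3] for r in c5a), mean(r[3] for r in others)

# Rows in which amino-acid identity is lower than nucleotide identity.
aa_below_nt_c5a = sum(1 for r in c5a if r[5] < r[4])
aa_below_nt_others = sum(1 for r in others if r[5] < r[4])

if __name__ == "__main__":
    print(f"C5a amino-acid identity: {c5a_aa_mean:.1f} +/- {c5a_aa_sd:.1f} %")
    print(f"mean Ks: C5a {ks_c5a:.2f}, others {ks_others:.2f}")
    print(f"mean Ka: C5a {ka_c5a:.2f}, others {ka_others:.2f}")
    print(f"rows with aa < nt identity: C5a {aa_below_nt_c5a}/6, "
          f"others {aa_below_nt_others}/12")
```

With these numbers the mean Ks values of the two groups are of the same order, whereas the mean Ka for the C5a receptor is about five times that of the other receptors.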

Northern-blot analysis (results not shown) on different canine 
tissues detected C5a receptor transcripts in thyroid (but not in 
cultured thyrocytes), testis, brain, lung, kidney, spleen and 
stomach. This partial distribution, as well as the cloning from a 
thyroid library, reflects the presence of circulating leucocytes and 
tissue macrophages in all organs. 

Within this background of high interspecies variability, the
similarities between protein sequences could pinpoint
functionally important conserved segments or residues,
implicated either in ligand-receptor interactions,
G-protein-coupling or receptor desensitization. We therefore aligned the
C5a receptor sequences with their closest relatives of the
G-protein-coupled family, the human formyl Met-Leu-Phe (fMLP)
(Boulay et al., 1990) and the two IL-8 receptors (Fig. 5). The
extracellular N-terminal domain is poorly conserved between the
canine and human receptors with only 44% (17/39) identical
residues. It has been proposed that the high negative charges of
the receptor are important for the interaction with the positively
charged ligand, although few of these charges are on the surface
of the C5a ligand (Zuiderweg et al., 1989; Mollison et al., 1989).
Within the N-terminal domain of the receptor, only three charged
residues are identical in both species, and the same observation
prevails for the extracellular loops that are the least conserved
parts of the receptor. This poor conservation, together with the
absence of species selectivity for the ligand (the canine receptor
interacts very efficiently with human C5a), does not support



the involvement of the extracellular domains in receptor-ligand 
interactions. 

Interestingly, in contrast to the human C5a receptor, the
canine receptor does not present potential N-glycosylation sites
in its extracellular N-terminus. The conserved Asn-Xaa-Thr
sequence is, in the canine sequence, followed by a proline, which
has been shown to prevent glycosylation (Pless & Lennarz, 1977;
Bause, 1983). Such differential glycosylation is also observed for
the C5a ligand which is glycosylated in human (Zuiderweg et al.,
1989) but not in pig (Williamson & Madison, 1990). The
intracellular C-terminus contains numerous serine and threonine
residues (10 for canine; 11 for human) that are potential
phosphorylation sites for βARK-related serine/threonine protein
kinases (Benovic et al., 1989). Another interesting feature of the
C-terminus, shared with the bovine NPY (Rimland et al., 1991),
human IL-8, RDC1 (Libert et al., 1989), and the human fMLP
receptors, is the absence of a cysteine residue, conserved among
most G-protein-coupled receptors, thought to serve as a site for
palmitoylation (Dohlman et al., 1991).

Potential phosphorylation sites, conserved in the four species,
are present in the third intracellular loop: one for the cyclic
AMP-dependent kinases (protein kinase A) (Feramisco et al.,
1980; Glass et al., 1986) at residue Thr-237, and three for the
protein kinase C (Woodgett et al., 1986) at residues Ser-233,
Ser-239 and Thr-242. Conserved residues in the C5a receptors include
Phe-136 which replaces the tyrosine residue of the Asp-Arg-Tyr
tri-peptide motif (end of TM3) common to most
G-protein-coupled receptors.

The transmembrane segments are the most conserved, with
73% identity (119/162); however, the intracellular loops (ICL)
are also highly conserved with 70% (31/44) of identical residues.
The lowest identity is observed for ICL2 (58%, 11/19). ICL3
presents up to 87% (13/15) amino-acid identity within the C5a
receptor group. The C5a, fMLP and IL-8 receptors share very
little similarity within ICL3, despite the coupling of these
receptors to a common intracellular pathway, and the
involvement of ICL3 in G-protein coupling (Dohlman et al., 1991;
Bonner, 1992). Charged residues thought to interact with negative
charges of G-proteins (Dohlman et al., 1991) are, however,
present in all cases.
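
The percentages quoted for each receptor region follow directly from the residue counts given in parentheses. As a quick consistency check, the sketch below recomputes them. The final step, deriving a count for ICL1 by subtraction, assumes that the "intracellular loops" total covers exactly ICL1, ICL2 and ICL3; that breakdown is not stated in the text.

```python
# (identical residues, total residues, percentage quoted in the text)
REGIONS = {
    "N-terminal domain": (17, 39, 44),
    "transmembrane segments": (119, 162, 73),
    "intracellular loops (all)": (31, 44, 70),
    "ICL2": (11, 19, 58),
    "ICL3": (13, 15, 87),
}


def percent_identity(identical, total):
    """Identity as a percentage of aligned residues in a region."""
    return 100.0 * identical / total


# Assumption: the ICL total is the sum of ICL1 + ICL2 + ICL3.
icl1_identical = 31 - 11 - 13
icl1_total = 44 - 19 - 15

if __name__ == "__main__":
    for name, (identical, total, quoted) in REGIONS.items():
        computed = percent_identity(identical, total)
        print(f"{name}: {identical}/{total} = {computed:.1f}% (quoted {quoted}%)")
    print(f"ICL1 by subtraction: {icl1_identical}/{icl1_total} = "
          f"{percent_identity(icl1_identical, icl1_total):.0f}%")
```

All five quoted percentages agree with the counts after rounding to the nearest integer.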

As a conclusion, cloning of the C5a receptors in several species
along with the characterization of the corresponding ligands
should allow a better approach in determining critical amino
acids implicated in receptor-ligand interaction, G-protein-coupling
and desensitization. It will pinpoint candidate residues for
mutagenesis and chimeric constructions, both in the receptors
and the ligands. Ultimately, the availability of the cloned
receptors should help the design of pharmacologically active
(non-peptide) inhibitors that could be used in syndromes where
inappropriate complement activation occurs.
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In a human neutrophil cDNA library, an orphan G- 
protein-coupled receptor, HNFAG09, with 37% nucleo- 
tide identity to the C5a receptor (C5a-R, CD88) was iden- 
tified. A novel feature of this gene, unlike C5a-R and 
other G-protein-coupled receptors, is the presence of an 
extraordinarily large predicted extracellular loop com- 
prised of in excess of 160 amino acid residues between 
transmembrane domains 4 and 5. Northern blot analysis 
revealed that expression of mRNA for this receptor in 
human tissues, while similar, was distinct from C5a-R 
expression. Although there were differences in expres- 
sion, transcripts for both receptors were detected in 
tissues throughout the body and the central nervous 
system. Mammalian cells stably expressing HNFAG09
specifically bound 125I-C3a and responded to a C3a
carboxyl-terminal analogue synthetic peptide and to
human C3a but not to rC5a with a robust calcium
mobilization response. HNFAG09 encodes the human
anaphylatoxin C3a receptor.



During complement activation the 74-77-amino acid ana- 
phylatoxins C3a, C4a, and C5a are released. They are potent 
inflammatory mediators, inducing cellular degranulation, 
smooth muscle contraction, arachidonic acid metabolism, cyto- 
kine release, and cellular chemotaxis (reviewed in Refs. 1-3), 
and have been implicated in the pathogenesis of a number of 
inflammatory diseases (4, 5). 

Studies have demonstrated the presence of a C3a receptor 
(C3a-R) on guinea pig platelets, rat mast cells, human neutro- 
phils, eosinophils, and platelets (3). A single class of high af- 
finity C3a binding sites has been characterized on human 
neutrophils and differentiated U937 cells (6). Competition 
binding and functional desensitization studies are consistent 
with the presence of a receptor for C3a, which is distinct from 
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the C5a-R (3, 6). However, there is evidence that C3a and C4a 
may bind to the same receptor as the two anaphylatoxins 
cross-desensitize guinea pig ileal tissue (2, 3), although other 
investigators using guinea pig macrophages indicate that there 
may be separate receptors (7). Functional activity of the C3a-R 
is sensitive to pertussis toxin, consistent with the binding site 
being composed of a G-protein-coupled receptor (6). 

A complete understanding of the role of C3a in the patho- 
genesis of the inflammatory response has been hampered by 
the lack of the cloned receptor. In this report we describe the 
molecular cloning, stable expression in mammalian cells, and 
functional characterization of the human C3a receptor. This 
same receptor was recently independently cloned from an 
HL-60 library by low stringency screening with a fMet-Leu-Phe 
receptor probe and, lacking functional data, claimed to be an 
orphan receptor (AZ3B) (8). Mouse L cells expressing AZ3B 
failed to bind and respond to the agonists examined, although 
C3a was not tested (8). 

EXPERIMENTAL PROCEDURES 

Materials — The C3a carboxyl-terminal analogue synthetic peptide
(WWGKKYRASKLGLAR) (9) was obtained from Bachem Bioscience,
Inc., King of Prussia, PA. C3a was purchased from Advanced Research
Technologies, San Diego, CA. Human rC5a was expressed in
Escherichia coli and purified to homogeneity. Other agonists were
obtained from Sigma.

cDNA Cloning — cDNA library construction and screening were
carried out essentially as described (10), and DNA sequence was
determined using an ABI sequencer (11). Expressed sequence tag analysis
(11-13) of cDNA clones derived from a human neutrophil
(lipopolysaccharide activated) cDNA library (oligo(dT)-primed and
constructed in the λ Uni-ZAP XR vector (Stratagene)) identified a clone
demonstrating significant homology (approximately 40% amino acid
sequence identity) to the C5a-R (14, 15). This cDNA clone contained an
incomplete open reading frame and therefore was used to reprobe the
neutrophil cDNA library to obtain a full-length cDNA. The alignment
of HNFAG09 and the C5a-R was determined by the method of
Needleman and Wunsch (21) using the Gap comparison program of the
Wisconsin Package, version 8, September 1994, Genetics Computer
Group, Madison, WI.
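
For readers unfamiliar with the Needleman and Wunsch method named above, the following is a minimal sketch of the global alignment algorithm. It is not the GCG Gap program: Gap uses a residue substitution matrix and separate gap creation and extension penalties, whereas this illustration assumes a simple match/mismatch score and a linear gap penalty, and the test sequences are arbitrary strings rather than receptor sequences.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all alignment columns."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("GATTACA", "GCATGCG")
    print(total)
    print(top)
    print(bottom)
    print(f"{percent_identity(top, bottom):.1f}% identity")
```

Note that percent identity depends on the denominator chosen (alignment length here); other conventions, such as the length of the shorter sequence, give different figures for the same alignment.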

Northern Blot Analysis — Commercially prepared (Clontech, Palo
Alto, CA) multiple tissue blots containing approximately 2 µg of poly(A)
mRNA per lane were sequentially hybridized with random primer
32P-labeled cDNAs spanning the coding regions of C5a-R and HNFAG09.
C5a-R was cloned via PCR¹ from differentiated U937 RNA. The final
washing step was carried out twice in 0.5 × SSC, 1% SDS at 65 °C for
20 min.

Stable Expression in RBL-2H3 Cells—To prepare HNFAG09 for
expression in mammalian cells, a 1.6-kb cDNA fragment was obtained by
PCR amplification that encompassed the entire HNFAG09 open
reading frame. This fragment was subcloned into KpnI/HindIII sites of the
mammalian expression vector, pCDN (16). Oligonucleotide primers
used for PCR amplification were 5'-GA AGT GGT ACC ATG GCG TC-3'
and 5'-GC TCC AAG CTT TCA CAC AGT TG-3' (the translation
start and stop codons are underlined). RBL-2H3 cells were
electroporated with either HNFAG09 or C5a-R in the pCDN mammalian
expression vector (16), exactly as described (17). Individual G418-resistant
(400 µg/ml) colonies were isolated and expanded. Clonal cell lines
expressing either HNFAG09 or C5a-R were chosen for further
functional and binding studies.

Calcium Mobilization — Fura 2-loaded clonal cell lines expressing
C5a-R or HNFAG09 were assayed for functional response, Ca2+
mobilization, as described (18).

Binding Assay — C3a was radioiodinated using IODO-BEADS



1 The abbreviations used are: PCR, polymerase chain reaction; kb, 
kilobase; PBL, peripheral blood leukocytes. 
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- 1 52 CACGACMAGAACACAAGAAGAGAAACCTCACCAAATTTTCTTGCCATACTTCATGACTTC 

- 9 2 ACTCT?GGCTAACTCTGGGGACCAGACACCACTCGTCGAGACATCCAGGTGCTGAACC^ 

- 3 2 CACCTACTGTCTCAG!'. IVl'V 1WIAGTT7AGCAATGCCGTCTTTCTC7GCTGAGACCAATT 

MASFSAETtfS 
29 CAACTX^CCTACTCTCACAGCCATGGAATGAGCCCCCAGTAATTCTCTCCATCGTCATTC 

T D L L S OPWNEPPVILSHVIL 
8 9 TCAGCCTTACT1' ri'I'TACTGGGATTX5CyAGGCAATGGGCTGGTG C "l\jT<3GGTGGC*?GGCC 

• LTFLLOLPOBOIVLWVAGL 
149 TGAAGATGCAGCGGACACrTGAACACAArnXJCrrCCTCCAC^ 

It K 0 R T V NT IHFL1LTLADLI 
209 TCTGCTCCCTCTCCTTGCCCTICTCGCTCGCTCAC^^ 

CCL* LPFSIiAHLALQGQWPY 
269 ACGGCAGGTTCCTATGCAAGCTCATCCCC?CCATCATTG7CCTCAACATGTTTGCCAGTG 
_ C RFLCKLIpflllVLBHFAlV 
329 T'CTTCCTGCTTACTGCCATTAGCCTGGATCGCTGTCTTXn'GGTATTCAAGCCAATCTGGT 

FLliTAZSLORCLVVFKPlHC 
389 GTCAGAATCATCGCAATGTAGGGATGGCCTGCTCTATCTGTGGATGTATCTGGGTGGTXKI 

ONHRNVGHACCXCOCZirVVA 
44 9 CTTGTGTGATGTGCATTCCTGTGTTCCTGTACCGGGAAATCTTCACTACAGACAACCAT^ 

CVMC IFVFVTREI FTTDNHM 
509 ATAGATGTGGCTACAAATTTGGTCTCTCCAGCTCA77AGATTATCCAGACTTTTATGGAG 

RCGYKFGLSSSLDYPDFYGD 
569 ATCCACTAGAAAACAGGTCTCrTGAAAACATTGt^CAGCCGCCTCGAGAAATGAATGATA 

PLEffRSLENIVQPPCEHNDR 
629 GGTTAGATCCTTCC1^'1'*'I1.'CAAACAAATGATCATCCTT < GGACAGTCCCCACTGTC'1*1\X' 

LOPSSFOTNOHPWTVPTVFO 
669 AACCTCAAACATTTCAAAGACCTTCTGCACATTCACTCCCT^ 

POTPQRPSADSLPRGSARLT 
749 CAAGTCAAAA1CTGTATTCTAATGTATTTAAACCTGCTGATGTGGTCTCACCTAAAATCC 
SOMLYSNVFKPADVVSPKIP 
809 CCAGTGGGTTTCCTATTGAAGATCACGAAACCAGCCCACTGGATAACTCTGATGCTTTTC 
SGFP I EDHETSPLDNSDAFL 
869 TCTCTACTCATTTAAAGCTCTTCCCTACX^rTCTAGCAATT 

STHLKLFPSASSKSFYESEL 
929 TACCAC AAGG7TTCCAGG ATTATTACAATTTAGGCCAATTCACACATGACGATC AAGTGC 
PQGFODYYNLCOFTDDDOVP 
989 CAACACCCCTCGTGGCAATAACGATCACTAGGC7AGTGGTGGG7TTCCTGCTGCCCTCTC 
TPLVAITITRLVVOFLLPBV 
1 049 TTATCATGATAGCCTGTTACAGCTTCATTGTCITCCGAATXSCAAAGGGGCCGCTTCGCCA 
IMIACTBFIVFRMQRGRFAK 
1 1 09 AGTCTCAGAGCAAAACCTTT^AGTGGCCGTGGTGGTGCTGGCTGT C i 1' 11 i ' l Vll ' lti CT 
S Q S K T f R V * V VVVAVFL**VCW 
1169 GGACTCCATACCACATTrTTGGAGTCCTXJTCATTGCTTACTGACCCAGAAACTCCCTTGG 
*»YBIFOVLSLLTDPETPLC 
1229 GGAAAACTCTGATCTCCTGGGATCATGTATGCATTGCTCTAGCATCTGCCAATAGTTGCT 
KTLKSWDHVCIALA8AB0CF 
1289 TTAATCCCrrcCTTTATGCCCTCTKX^yU^ 

■ PFtTALLGRDFRKKAROSI 
1349 TTCAGGGAATTCTGGAGGCAGCCTTCAGTGAGGAGCTCACACGTTCCACCCACTGTCCCT 

QG1 LEAAFSEELTRSTHCPS 
1409 CAAACAATGTCATTTCAGAAAGAAATAGTACAACTGTGTGAAAATGTGGAGCAGCCAACA 

KNVI SERHSTTV* 
1 4 69 AGCAGGGGCTCrrAGGCAATCACATAGTGAAAGTTTATAAGAGGATGAAGTGATATGCTC 
1529 AGCAGCGGACTTCAAAAACTGTCAAAGAATCAATCCAGCGGTTCTCAAACGGTACACAGA 
1569 CTATTGACATCAGCATCACCTAGAAACTTGTTAGAAATGCAAATTCTCAAGCCGCATCCC 
1 64 9 AGACTTGCTGAATCGGAATCTCTGGGGGTTGGGACCCAGCAAGGGCACTTAACAAACCCC 
1709 CGTTTCTGATTAATGCTAAATGTAAGAATCATTGTAAACAT7AGTTCTATTTCTATCCCA 
1769 AACTAAGCTATGTGAAATAAGAGAAGCTACTTTCTTTTTAAATGATGTTGAATATTTGTC 
1829 G ATATTTCCATC ATT AAATTTTTCCTT AGCATTGTCT AAGTCAAAAAAAAAAAAAAAAAA 
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Fig. 1. A, nucleotide and deduced amino acid sequence of HNFAG09. 
The predicted seven-membrane spanning domains of HNFAG09 are 
indicated by bold, and glycosylation sites are indicated by italics and 
underline. This nucleotide sequence has been submitted to GenBank; 
the accession number is U62027. B, predicted membrane topology 
of HNFAG09. Amino acid residues in common between C5a-R and
HNFAG09 have been highlighted black; two predicted N-linked
glycosylation sites, in the large extracellular loop and the amino terminus,
are indicated by gray shading.



(Pierce) to a specific activity of 100 Ci/mmol. Increasing
concentrations of cold competitor were added to 1 × 10^6 cells in the
presence of 125I-C3a (2.3 nM), and the assay was performed essentially as
described (6).



C5aR 



HNFAG09 




Fig. 2. C5a-R and HNFAG09 transcripts are abundantly ex- 
pressed in the central nervous system and throughout the body. 

Tissue distribution of C5a-R and HNFAG09 as determined by Northern 
blot analysis. The tissue source of RNA is indicated above each lane. 



RESULTS AND DISCUSSION 

Expressed sequence tag analysis (11-13) of cDNA clones
derived from a human neutrophil (lipopolysaccharide
activated) cDNA library identified a clone demonstrating
significant homology (approximately 40% amino acid sequence
identity) to the C5a-R. This expressed sequence tag contained an
incomplete open reading frame and therefore was used to
reprobe the neutrophil cDNA library to obtain a 2040-base pair
cDNA encoding a complete orphan G-protein-coupled receptor
of 482 amino acids, which shared 37% nucleotide identity
throughout the coding regions with the C5a-R (Fig. 1A).
Although similar to the C5a-R, this cDNA contains two predicted
extracellular N-linked glycosylation sites and an unusually
large extracellular domain between transmembrane domains 4
and 5 comprised of over 160 amino acid residues (Fig. 1A). The
majority of the identical residues between the C5a-R and
HNFAG09 reside in the predicted transmembrane spanning
domains and in the second intracellular loop (Fig. 1B).

By Northern blot analysis, expression of HNFAG09 in human
tissues and cell lines is distinct from C5a-R expression. An
~2.2-kb C5a-R transcript was abundantly expressed in peripheral
blood leukocytes (PBL), lung, spleen, heart, placenta, spinal
cord, and throughout the brain. An ~2.1-kb HNFAG09
transcript was predominantly expressed in lung, spleen, ovary,
placenta, small intestine, throughout the brain, and to a much
lesser extent than C5a-R, in heart and PBL (Fig. 2). Although
by Northern blot analysis the specific cells within the various
tissues examined, which are expressing C5a-R and HNFAG09,
cannot be determined, these data suggest that these
receptors are abundantly expressed throughout the body. By
fluorescence-activated cell sorting using polyclonal antibodies
generated to fusion proteins composed of glutathione
S-transferase or maltose binding protein and the extracellular loop,
this receptor has been shown to be expressed on several cell
types, including U937, HL-60, PBL, and human neutrophils
and monocytes (8).

Preliminary functional characterization in Xenopus laevis
oocytes suggested that HNFAG09 encoded a human
anaphylatoxin receptor.² To confirm these results in mammalian cells,
this receptor was expressed in RBL-2H3 cells (19), a rat
basophil cell line, which when transfected with an expression
plasmid encoding the C5a-R expresses receptors that are
functionally active (17). RBL-2H3 cells were stably transfected with
mammalian expression plasmids encoding the C5a-R or
HNFAG09, and Fura 2-loaded cells were tested for a C5a- or
C3a-induced mobilization of intracellular Ca2+. C5a-R- but not

2 R. S. Ames, P. Nuthulaganti, and C. Kumar, unpublished 
observation. 
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Fig. 3. Cells expressing HNFAG09 but not C5a-R bind and 
respond to C3a. Calcium mobilization by Fura 2-loaded cells
expressing C5a-R (A and C) or HNFAG09 (B and D) in response to rC5a
(10 nM (A) or 100 nM (B)) or C3a analogue peptide (1 µM, C and D) is shown. E,
competition of 125I-C3a binding to HNFAG09 expressing RBL-2H3 cells
by increasing concentrations of C3a analogue synthetic peptide (closed
circle), C3a (open square), or rC5a (open triangle).
HNFAG09-expressing cells responded to rC5a (Fig. 3, A and B). 
A robust response to a C3a carboxyl-terminal analogue synthetic 
peptide (WWGKKYRASKLGLAR) (9) (EC50 = 3.9 nM) 
was detected in cells expressing HNFAG09, but no response 
was obtained for C5a-R-expressing cells (Fig. 3, D and C, respectively). 
Similarly, HNFAG09 but not C5a-R expressing 
RBL-2H3 cells also responded to native human C3a (EC50 = 
0.3 nM; data not shown). 

C3a was radioiodinated and used in whole cell binding assays 
to further characterize HNFAG09. Binding of 125I-C3a to 
HNFAG09 expressing RBL-2H3 cells was competed by increasing 
concentrations of C3a (IC50 = 3.0 nM) and the C3a analogue 
synthetic peptide (IC50 = 155 nM) but not by rC5a (Fig. 3E). By 
saturation binding and Scatchard analysis a single class of C3a 
binding sites was identified with an estimated Kd of 0.3 nM and 
a Bmax of 32,000 receptors/cell (data not shown). Curiously, a 
HEK 293 cell line stably expressing HNFAG09 mRNA by 
Northern blot neither bound nor responded to C3a (data not 
shown). 
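
For illustration only, the saturation binding estimates reported above (Kd of 0.3 nM, Bmax of 32,000 receptors/cell) can be related through a standard one-site binding model and its Scatchard linearization. The text does not say how the fit was performed, so the model choice, the function names, and the synthetic ligand concentrations below are assumptions and not the authors' method.

```python
# Hypothetical sketch of a one-site binding model and Scatchard analysis.
# Only KD_NM and BMAX come from the text; everything else is illustrative.

KD_NM = 0.3      # estimated Kd reported in the text (nM)
BMAX = 32_000    # estimated Bmax reported in the text (receptors/cell)


def bound(free_nm, kd_nm=KD_NM, bmax=BMAX):
    """Specific binding (receptors/cell) at a free ligand concentration (nM)."""
    return bmax * free_nm / (kd_nm + free_nm)


def scatchard_fit(free_nm_values, bound_values):
    """Estimate (Kd, Bmax) from a Scatchard plot of bound/free versus bound.

    For a single class of sites, bound/free = Bmax/Kd - bound/Kd, so the
    least-squares line has slope -1/Kd and y-intercept Bmax/Kd.
    """
    xs = list(bound_values)
    ys = [b / f for b, f in zip(bound_values, free_nm_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    return kd, intercept * kd


# Synthetic, noise-free data points (assumed concentrations, in nM).
free_concs = [0.05, 0.1, 0.3, 1.0, 3.0, 10.0]
kd_est, bmax_est = scatchard_fit(free_concs, [bound(f) for f in free_concs])
```

In this model, half of the sites are occupied when the free ligand concentration equals Kd, and a straight Scatchard line is what indicates a single class of binding sites.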

RBL-2H3 cells expressing HNFAG09 bind and respond to 
C3a and a C3a analogue synthetic peptide but not C5a. These 
data, along with the results of the tissue distribution analysis, 
are consistent with HNFAG09 (AZ3B) (8) encoding the human 
C3a receptor. 

The demonstration that C5a-R (reviewed in Ref. 20) and 
C3a-R expression is not limited to myeloid cells but that they 
both are expressed in a variety of non-myeloid cells throughout 
the body and that they are abundantly expressed in the central 
nervous system is consistent with these receptors having a 
much greater role in the pathogenesis of inflammatory and 
autoimmune diseases than previously suspected. Now that the 
receptor for C3a has been identified, further studies to eluci- 
date the role of C3a in immune function and disease will be 
facilitated. 